We propose an information-theoretic framework for analyzing control systems based on the close 
relationship of controllers to communication channels. A communication channel takes an input 
state and transforms it into an output state. A controller, similarly, takes the initial state of a sys- 
tem to be controlled and transforms it into a target state. In this sense, a controller can be thought 
of as an actuation channel that acts on inputs to produce desired outputs. In this transformation 
process, two different control strategies can be adopted: (i) the controller applies an actuation dy- 
namics that is independent of the state of the system to be controlled (open-loop control); or (ii) 
the controller enacts an actuation dynamics that is based on some information about the state of 
the controlled system (closed-loop control). Using this communication channel model of control, we 
provide necessary and sufficient conditions for a system to be perfectly controllable and perfectly 
observable in terms of information and entropy. In addition, we derive a quantitative trade-off be- 
tween the amount of information gathered by a closed-loop controller and its relative performance 
advantage over an open-loop controller in stabilizing a system. This work supplements earlier results 
[H. Touchette, S. Lloyd, Phys. Rev. Lett. 84, 1156 (2000)] by providing new derivations of the ad- 
vantage afforded by closed-loop control and by proposing an information-based optimality criterion 
for control systems. New applications of this approach pertaining to proportional controllers and 
the control of chaotic maps are also presented. 
I. INTRODUCTION 

It is common in studying controllers to describe the 
interplay between the sensors which estimate the state 
of a system intended to be controlled, and the actuators 
used to actually modify the dynamics of the controlled 
system as a transfer of information involving three steps: 
estimation, decision, and actuation. In the first step, sen- 
sors are used to gather information from the controlled 
system in the form of data relative to its state (estima- 
tion step). This information is then processed according 
to some plan or control strategy in order to determine 
which control dynamics is to be applied (decision step), 
to be finally transferred to the actuators which feed the 
processed information back to the controlled system to 
modify its dynamics, typically with the goal of decreas- 
ing the uncertainty in the value of the system's variables 
(actuation step) [1-3]. 

The estimation step is optional in this sequence, and 
whether it is present determines which type of control 
strategy is used. In so-called closed-loop or feedback 
control techniques, actuators rely explicitly on the infor- 
mation provided by sensors to apply the actuation dy- 
namics, whereas in open-loop control there is no estima- 
tion step preceding the actuation step. In other words, 
an open-loop controller distinguishes itself from a closed- 
loop controller in that it does not need a continual input 
of 'selective' information [4] to work: like a throttle or 
a hand brake, it implements a control action independently 
of the state of the controlled system. In this respect, 
open-loop control techniques represent a subclass of 
closed-loop controls that neglect the information made 
available by estimation. 

Since control is fundamentally about information (get- 
ting it, processing it, and applying it) it is perhaps sur- 
prising to note that few efforts have been made to develop 
a quantitative theory of controllers focused on a clear 
and rigorous definition of information. Indeed, although 
controllers have been described by numerous authors as 
information gathering and using systems (see, e.g., [1, 5- 
7]), and despite many results related to this problem 
[8-22], there exists at present no general 
information-theoretic formalism that characterizes the exchange of 
information between a controlled system and a controller 
and, more importantly, that allows a definite value of 
information to be assigned in control processes [23, 24]. 
To address this deficiency, we present in this paper 
a quantitative study of the role of information in control. 
The basis of the results presented here was first 
elaborated in [25], and draws upon the work of several of 
the papers cited above by bringing together some aspects 
of dynamical systems, information theory, and 
probabilistic networks to construct control models in the 
context of which quantities analogous to entropy can be 
defined. 

Central to our approach is the notion of a communication 
channel, and its extension to the idea of control channels. 
As originally proposed by Shannon [26], a (memoryless) 
communication channel can be represented mathematically 
by a probability transition matrix, say $p(y|x)$, 
relating the two random variables $X$ and $Y$ which are 
interpreted, respectively, as the input and the output of 
the channel. In the next two sections of the present work, 
we adapt this common probabilistic picture of communi- 
cation engineering to describe the operation of a basic 
control setup, composed of a sensor linked to an actu- 
ator, in terms of two channels: one coupling the initial 
state of the system to be controlled and the state of the 
sensor (sensor channel), and another one describing the 
state evolution of the controlled system as influenced by 
the sensor-actuator's states (actuation channel). 

In Sections IV and V, we use this model in conjunction 
with the properties of entropy-like quantities to exhibit 
fundamental results pertaining to control systems. As a 
first of these results, we show that the classical definition 
of controllability, a concept well-known to the field of con- 
trol theory, can be rephrased in an information-theoretic 
fashion. This definition is used, in turn, to show that a 
system is perfectly controllable upon the application of 
controls if, and only if, the target state of that system 
is statistically independent of any other external systems 
playing the role of noise sources. A similar information- 
theoretic result is also derived for the complementary 
concept of observability. Moreover, we provide bounds 
on the amount of information a feedback controller must 
gather in order to stabilize the state of a system. More 
precisely, we prove that the amount of information 
gathered by the controller must be bounded below by the 
difference $\Delta H_{\text{closed}} - \Delta H^{\max}_{\text{open}}$, 
where $\Delta H_{\text{closed}}$ is the closed-loop entropy 
reduction that results from utilizing information in the 
control process, and $\Delta H^{\max}_{\text{open}}$ is the 
maximum decrease of entropy attainable when restricted to 
open-loop control techniques. This last result, as we will 
see, can be used to define an information-based optimal- 
ity criterion for control systems. 

The idea of reducing the entropy of a system using in- 
formation gathered from estimating its state is not novel 
by itself. Indeed, as he wondered about the validity of 
the second law of thermodynamics, the physicist James 
Clerk Maxwell was probably the first to imagine in 1867 
a device (or a 'demon' as it was later called) whose task 
is to reduce the entropy of a gas using information about 
the positions and velocities of the particles forming the 
gas. (See [27] for a description of Maxwell's demon and 
a guide to this subject's literature.) In the more specific 
context of control theory, the problem of reducing the en- 
tropy of a dynamical system has also been investigated, 
notably by Poplavskii [10, 11] and by Weidemann [9]. 
Poplavskii analyzed the information gathered by sensors 
in terms of Brillouin's notion of negentropy [27, 28], and 
derived a series of physical limits to control. His study fo- 
cuses on the sensor part of controllers, leaving aside the 
actuation process which, as will be shown, can be also 
treated in an information-theoretic fashion. In a similar 
way, Weidemann performed an information-based analy- 
sis of a class of linear controllers having measure preserv- 
ing sensors. Other related ideas and results can be found 
in Refs. [8, 12-22] . 
FIG. 1: Directed acyclic graphs representing a basic control 
process. (a) Full control system with a sensor S and an 
actuator A. (b) Reduced closed-loop diagram obtained by merging 
the sensor and the actuator into a single device, the 
controller. (c) Reduced open-loop control diagram. (d) Single 
actuation channel enacted by the controller's state C = c. 

In the present paper, we build on these studies and go 
further by presenting results which apply equally to lin- 
ear and nonlinear systems, and can be generalized with 
the aid of a few modifications to encompass continuous- 
space systems as well as continuous-time dynamics. To 
illustrate this scope of applications, we study in Section 
VI specific examples of control systems. Among these, we 
consider two variants of proportional controllers, which 
play a predominant role in the design of present-day 
controllers, and we complete the numerical investigation 
of noise-perturbed chaotic controllers initiated in 
[25]. Finally, we remark in Section VII on the relation- 
ship of our framework with thermodynamics and optimal 
control theory. 

II. CHANNEL-LIKE MODELS OF CONTROL 

In this section, we introduce a simple control model 
that allows investigation of the dynamical interplay that 
exists between a sensor and an actuator to 'move' a sys- 
tem from an unknown initial state to a desired final target 
state. Such a process is depicted schematically in Figure 
1 in the form of directed acyclic graphs, also known as 
Bayesian networks [29, 30]. The vertices of these graphs 
correspond to random variables representing the state of 
a (classical) system; the arrows give the probabilistic de- 
pendencies among the random variables according to the 
general decomposition 

$$p(x_1, x_2, \ldots, x_N) = \prod_{i=1}^{N} p(x_i | \pi[X_i]), \tag{1}$$

where $\pi[X_i]$ is the set of random variables which are 
direct parents of $X_i$, $i = 1, 2, \ldots, N$ ($\pi[X_1] = \emptyset$). 
The acyclic condition of the graphs ensures that no vertex is 
a descendant or an ancestor of itself, in which case we 
can order the vertices chronologically, i.e., from 
ancestors to descendants. This defines a causal ordering and, 
consequently, a time line directed on the graphs from left 
to right. 

In the control graph of Figure 1a, the random variable 
$X$ represents the initial state of the system to be 
controlled, whose values $x \in \mathcal{X}$ are drawn according 
to a fixed probability distribution $p_X(x)$. In conformity 
with our introductory description of controllers, this 
initial state is controlled to a final state $X'$ with state values 
$x' \in \mathcal{X}$ by means of a sensor, of state variable $S$, and an 
actuator whose state variable $A$ influences the transition 
from $X$ to $X'$. For simplicity, all the random variables 
describing the different systems are taken to be discrete 
random variables with finite sets of outcomes. The 
extension to continuous-state systems is discussed in Section 
IV. Also, to further simplify the analysis of this model, 
we assume throughout this paper that the sensor and 
the actuator are merged into a single device, called the 
controller, which fulfills both the roles of estimation and 
actuation (see Figure 1b). The state of the controller is 
denoted by $C$, and assumes values from some set $\mathcal{C}$ of 
admissible controls [31]. 

Using this notation together with the decomposition 
of Eq. (1), the joint distribution $p(x, x', c)$ describing the 
causal dependencies between the states of the control 
graphs can now be constructed. For instance, the 
complete joint distribution corresponding to the closed-loop 
graph of Figure 1b is written as 

$$p(x, x', c)_{\text{closed}} = p_X(x)\, p(c|x)\, p(x'|x, c), \tag{2}$$

while the open-loop version of this graph, depicted in 
Figure 1c, is characterized by a joint distribution of the 
form 

$$p(x, x', c)_{\text{open}} = p_X(x)\, p_C(c)\, p(x'|x, c). \tag{3}$$
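
As a minimal sketch of Eqs. (2) and (3), the joint distributions can be tabulated directly for small alphabets. The function names, the list layout (`sensor[x][c]` for $p(c|x)$, `actuation[c][x][x']` for $p(x'|x,c)$) and the numerical values below are illustrative assumptions, not part of the original model.

```python
from itertools import product

def joint_closed(p_x, sensor, actuation):
    """Eq. (2): p(x, x', c) = p_X(x) p(c|x) p(x'|x, c)."""
    n_c, n_x = len(actuation), len(p_x)
    return {(x, xp, c): p_x[x] * sensor[x][c] * actuation[c][x][xp]
            for x, xp, c in product(range(n_x), range(n_x), range(n_c))}

def joint_open(p_x, p_c, actuation):
    """Eq. (3): p(x, x', c) = p_X(x) p_C(c) p(x'|x, c)."""
    n_c, n_x = len(actuation), len(p_x)
    return {(x, xp, c): p_x[x] * p_c[c] * actuation[c][x][xp]
            for x, xp, c in product(range(n_x), range(n_x), range(n_c))}

# Hypothetical example: a bit X, a noisy sensor, and two actuation channels.
p_x = [0.7, 0.3]
sensor = [[0.9, 0.1], [0.2, 0.8]]      # sensor[x][c] = p(c|x)
actuation = [[[1, 0], [0, 1]],         # c = 0: identity
             [[0, 1], [1, 0]]]         # c = 1: swap
closed = joint_closed(p_x, sensor, actuation)
opened = joint_open(p_x, [0.5, 0.5], actuation)
```

In the open-loop table the marginal over $x'$ factorizes as $p_X(x)\,p_C(c)$, whereas in the closed-loop table it equals $p_X(x)\,p(c|x)$.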

Following the definition of closed- and open-loop 
control given above, what distinguishes the two control 
strategies probabilistically and graphically is the presence, for 
closed-loop control, of a direct correlation link between $X$ 
and $C$ represented by the conditional probability $p(c|x)$. 
This correlation can be thought of as a (possibly noisy) 
communication channel, referred to here as the sensor 
or measurement channel, that enables the controller to 
gather an amount of information identified formally with 
the mutual information 

$$I(X;C) = \sum_{x \in \mathcal{X},\, c \in \mathcal{C}} p_{XC}(x, c) \log \frac{p_{XC}(x, c)}{p_X(x)\, p_C(c)}, \tag{4}$$

where $p_{XC}(x, c) = p_X(x)\, p(c|x)$. (All logarithms are 
assumed to be to the base 2, except where explicitly noted.) 
Recall that $I(X;C) \geq 0$ with equality if and only if the 
random variables $X$ and $C$ are statistically independent 
[32], so that in view of this quantity we are naturally led 
to define open-loop control with the requirement that 
$I(X;C) = 0$; closed-loop control, on the other hand, 
must be such that $I(X;C) \neq 0$. 
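
The distinction between $I(X;C) = 0$ and $I(X;C) \neq 0$ can be checked numerically. The following sketch evaluates Eq. (4) for a sensor channel stored as `p_c_given_x[x][c]`; the function name and the two example sensors are assumptions made for illustration.

```python
from math import log2

def mutual_information(p_x, p_c_given_x):
    """I(X;C) in bits, Eq. (4), for p_X(x) and a sensor channel p(c|x)."""
    n_c = len(p_c_given_x[0])
    # Marginal p_C(c) = sum_x p(c|x) p_X(x)
    p_c = [sum(p_x[x] * p_c_given_x[x][c] for x in range(len(p_x)))
           for c in range(n_c)]
    total = 0.0
    for x, px in enumerate(p_x):
        for c in range(n_c):
            joint = px * p_c_given_x[x][c]
            if joint > 0:  # terms with zero probability contribute nothing
                total += joint * log2(joint / (px * p_c[c]))
    return total

p_x = [0.5, 0.5]
perfect_sensor = [[1.0, 0.0], [0.0, 1.0]]   # C = X (closed loop)
blind_sensor = [[0.5, 0.5], [0.5, 0.5]]     # C independent of X (open loop)
i_closed = mutual_information(p_x, perfect_sensor)
i_open = mutual_information(p_x, blind_sensor)
```

For a uniformly distributed bit, the perfect sensor gathers one bit of information, while the sensor whose output is independent of $X$ gathers none.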



As for the actuation part of the control process, the 
joint distributions of Eqs. (2)-(3) show that it is 
accounted for by the channel-like probability transition 
matrix $p(x'|x, c)$. The entries of this actuation matrix give 
the probability that the controlled system in state $X = x$ 
is actuated to $X' = x'$ given that the controller's state is 
$C = c$. From here on, it will be convenient to think of 
the control actions indexed by each value of $C$ as a set of 
actuation channels, with memoryless transition matrices 

$$p(x'|x)_c = p(x'|x, c), \tag{5}$$

governing the transmission of the random variable $X$ to 
a target state $X'$. In terms of the control graphs, such 
channels are represented in the same form as in Figure 
1d to show that the fixed value $C = c$ (filled circle in the 
graph) enacts a transformation of the random variable $X$ 
(open circle) to a yet unspecified value associated with 
the random variable $X'$ (open circle as well). Guided by 
this graphical representation, we will show in the next 
section that the overall action of a controller can be 
decomposed into a series of single conditional actuation 
actions or subdynamics triggered by the internal state of 
$C$. 

Here we characterize the effect of the subdynamics 
available to a controller on the entropy of the initial state 
$X$: 

$$H(X) = -\sum_{x \in \mathcal{X}} p_X(x) \log p_X(x). \tag{6}$$

In theory, this effect is completely determined by the 
choice of the initial state $X$, and the form of the 
actuation matrices. The effect of these two 'variables' on $H(X)$ 
is categorized according to the three following classes of 
dynamics: 

One-to-one transitions: A given control subdynamics 
specified by $C = c$ conserves the entropy of the initial 
state $X$ if the corresponding probability matrix $p(x'|x)_c$ 
is that of a noiseless channel. Permutations or 
translations of $X$ are examples of this sort of dynamics. 

Many-to-one transitions: A control channel $p(x'|x)_c$ 
may cause some subset $\mathcal{X}_c$ of the state space $\mathcal{X}$ to be 
mapped onto a smaller subset of values for $X'$. In this 
case, the corresponding subdynamics is said to be 
dissipative or volume-contracting as it decreases the entropy 
of ensembles of states lying in $\mathcal{X}_c$. 

One-to-many transitions: A channel $p(x'|x)_c$ can also 
lead $H(X)$ to increase if it is non-deterministic, i.e., if it 
specifies the image of one or more values of $X$ only up 
to a certain probability different from zero or one. This 
will be the case, for example, if the actuator is unable 
to accurately manipulate the dynamics of the controlled 
system, or if any part of the control system is affected by 
external and non-controllable systems. 
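
The three classes of dynamics can be illustrated on a three-state system. In this sketch the input distribution and the three transition matrices (`channel[x][x']` standing for $p(x'|x)_c$) are hypothetical values chosen only to exhibit each behavior.

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, Eq. (6); zero-probability terms are skipped."""
    return -sum(q * log2(q) for q in p if q > 0)

def push_forward(p_x, channel):
    """p_X'(x') = sum_x p(x'|x) p_X(x), with channel[x][x'] = p(x'|x)."""
    n_out = len(channel[0])
    return [sum(p_x[x] * channel[x][xp] for x in range(len(p_x)))
            for xp in range(n_out)]

p_x = [0.7, 0.2, 0.1]
permutation = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]    # one-to-one
many_to_one = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]    # states 0 and 1 merged
noisy = [[0.5, 0.5, 0], [0, 1, 0], [0, 0, 1]]      # state 0 randomized

h_in = entropy(p_x)
h_perm = entropy(push_forward(p_x, permutation))
h_merge = entropy(push_forward(p_x, many_to_one))
h_noisy = entropy(push_forward(p_x, noisy))
```

With these values the permutation leaves the entropy unchanged, the merging map lowers it, and the noisy map raises it. A noisy channel need not raise the entropy for every input distribution; the input here was picked so that it does.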

FIG. 2: Control diagrams illustrating the purification 
procedure for (a) the actuation channel, and (b) the sensor channel. 
Purifying a channel, for instance the sensor channel, simply 
means that knowing the value of X and Z enables one to 
know with probability one the value of C. However, 
discarding (viz., tracing out) any information concerning Z leaves us 
with some uncertainty as to which C is reached from a given 
value for X. 

From a strict mathematical point of view, note that 
any non-deterministic channel modeling a source of noise 
at the level of actuation or estimation can be represented 
abstractly as a randomly selected deterministic channel 
with transition matrix containing only zeros and ones. 
The outcome of a random variable undisclosed to the 
controller can be thought of as being responsible for the 
choice of the channel to use. Figure 2 shows 
specifically how this can be done by supplementing our original 
control graphs of Figure 1 with an exogenous and 
non-controllable random variable $Z$ in order to 'purify' the 
channel considered (actuation or estimation) [33]. For 
the actuation channel, for instance, the purification 
condition simply refers to the two following properties: 

(i) The mapping from $X$ to $X'$ conditioned on the 
values $c$ and $z$, as described by the extended transition 
matrix $p(x'|x, c, z)$, is deterministic for all $c \in \mathcal{C}$ and $z \in \mathcal{Z}$; 

(ii) When traced out of $Z$, $p(x'|x, c, z)$ reproduces the 
dynamics of $p(x'|x, c)$, i.e., 

$$p(x'|x, c) = \sum_{z \in \mathcal{Z}} p(x'|x, c, z)\, p_Z(z), \tag{7}$$

for all $x', x \in \mathcal{X}$, and all $c \in \mathcal{C}$. 
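
A simple instance of the purification conditions (i) and (ii) is a binary channel that flips its input with probability `eps`. Under the assumption that $Z$ selects between the identity and the flip, the sketch below averages the two deterministic channels as in Eq. (7), for one fixed control value. The names `trace_out_noise` and `eps` are illustrative.

```python
def trace_out_noise(det_channels, p_z):
    """Eq. (7) for a fixed c: average deterministic channels over p_Z(z).

    det_channels[z][x][x'] holds p(x'|x, c, z), with entries 0 or 1 only.
    """
    n_x = len(det_channels[0])
    n_xp = len(det_channels[0][0])
    return [[sum(p_z[z] * det_channels[z][x][xp] for z in range(len(p_z)))
             for xp in range(n_xp)]
            for x in range(n_x)]

identity = [[1, 0], [0, 1]]   # selected when z = 0
flip = [[0, 1], [1, 0]]       # selected when z = 1
eps = 0.1                     # assumed flip probability, p_Z(1)
noisy = trace_out_noise([identity, flip], [1 - eps, eps])
```

The resulting matrix `noisy` is the familiar binary symmetric channel: each row is a probability distribution, and the off-diagonal entries equal `eps`.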



III. CONDITIONAL ANALYSIS 

To complement the material introduced in the previous 
section, we now present a technique for analyzing 
the control graphs that emphasizes further the conceptual 
importance of the actuation channel and its graphical 
representation. The technique is based on a useful 
symmetry of Figure 1c that enables us to separate the 
effect of the random variable $X$ in the actuation matrix 
from the effect of the control variable $C$. From one 
perspective, the open-loop decomposition 

$$p_{X'}(x')_{\text{open}} = \sum_{c \in \mathcal{C}} p_C(c) \left( \sum_{x \in \mathcal{X}} p(x'|x, c)\, p_X(x) \right) \tag{8}$$

suggests that an open-loop control process can be 
decomposed into an ensemble of actuations, each one indexed 
by a particular value $c$ that takes the initial distribution 
$p_X(x)$ to a conditional distribution (sum in 
parentheses) 

$$p(x'|c)_{\text{open}} = \sum_{x \in \mathcal{X}} p(x'|x, c)\, p_X(x). \tag{9}$$



The final marginal distribution $p_{X'}(x')_{\text{open}}$ is then 
obtained by evaluating the sum over $c$ in Eq. (8), thus 
averaging $p(x'|c)_{\text{open}}$ over the control variable. From another 
perspective, Eq. (8), re-ordered as 

$$p_{X'}(x')_{\text{open}} = \sum_{x \in \mathcal{X}} p_X(x) \left( \sum_{c \in \mathcal{C}} p(x'|x, c)\, p_C(c) \right), \tag{10}$$

indicates that the overall action of a controller can be 
seen as transmitting $X$ through an 'averaged' channel 
(sum in parentheses) whose transition matrix is given by 

$$p(x'|x) = \sum_{c \in \mathcal{C}} p(x'|x, c)\, p_C(c). \tag{11}$$

In the former perspective, each actuation subdynamics 
represented by the control graph of Figure 1d can be 
characterized by a conditional open-loop entropy 
reduction defined by 

$$\Delta H^c_{\text{open}} = H(X) - H(X'|c)_{\text{open}}, \tag{12}$$

where 

$$H(X'|c) = -\sum_{x' \in \mathcal{X}} p(x'|c) \log p(x'|c). \tag{13}$$

(Subscripts of $H$ indicate from which distribution the 
entropy is to be calculated.) In the latter perspective, 
the entropy reduction associated with the unconditional 
transition from $X$ to $X'$ is simply the open-loop entropy 
reduction 

$$\Delta H_{\text{open}} = H(X) - H(X')_{\text{open}}, \tag{14}$$

which characterizes the control process as a whole, 
without regard to any knowledge of the controller's state. 
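
The two perspectives on open-loop control can be compared on a small example. The sketch below assumes a bit controlled by two hypothetical actuation channels (identity for $c = 0$, reset to 0 for $c = 1$), stored as `actuation[c][x][x']`, and evaluates the conditional distributions of Eq. (9), the averaged channel of Eq. (11), and the entropy reductions of Eqs. (12) and (14).

```python
from math import log2

def entropy(p):
    return -sum(q * log2(q) for q in p if q > 0)

def conditional_outputs(p_x, actuation):
    """Eq. (9): p(x'|c)_open = sum_x p(x'|x,c) p_X(x), one list per c."""
    n = len(p_x)
    return [[sum(p_x[x] * ch[x][xp] for x in range(n)) for xp in range(n)]
            for ch in actuation]

def averaged_channel(p_c, actuation):
    """Eq. (11): p(x'|x) = sum_c p(x'|x,c) p_C(c)."""
    n = len(actuation[0])
    return [[sum(p_c[c] * actuation[c][x][xp] for c in range(len(p_c)))
             for xp in range(n)] for x in range(n)]

p_x, p_c = [0.5, 0.5], [0.5, 0.5]
actuation = [[[1, 0], [0, 1]],   # c = 0: identity (one-to-one)
             [[1, 0], [1, 0]]]   # c = 1: reset to 0 (many-to-one)

per_c = conditional_outputs(p_x, actuation)
dH_c = [entropy(p_x) - entropy(p) for p in per_c]             # Eq. (12)
# Marginal of X' obtained by averaging over c, Eq. (8) ...
via_8 = [sum(p_c[c] * per_c[c][xp] for c in range(2)) for xp in range(2)]
# ... and by sending X through the averaged channel, Eq. (10)
avg = averaged_channel(p_c, actuation)
via_10 = [sum(p_x[x] * avg[x][xp] for x in range(2)) for xp in range(2)]
dH_open = entropy(p_x) - entropy(via_8)                       # Eq. (14)
```

Both routes give the same marginal for $X'$. In this example the reset channel alone removes one bit, but the unconditional reduction `dH_open` is smaller because the choice of $c$ is itself random.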

For closed-loop control, the decomposition of the 
control action into a set of conditional actuations seems a 
priori inapplicable, for the controller's state itself 
depends on the initial state of the controlled system, and 
thus cannot be fixed at will. Despite this fact, one can 
use the Bayesian rule of statistical inference 

$$p(x|c) = \frac{p(c|x)\, p_X(x)}{p_C(c)}, \tag{15}$$

where 

$$p_C(c) = \sum_{x \in \mathcal{X}} p(c|x)\, p_X(x), \tag{16}$$

to invert the dependency between $X$ and $C$ in the sensor 
channel so as to rewrite the closed-loop decomposition in 
the following form: 

$$p_{X'}(x')_{\text{closed}} = \sum_{c \in \mathcal{C}} p_C(c) \left( \sum_{x \in \mathcal{X}} p(x'|x, c)\, p(x|c) \right). \tag{17}$$
By comparing this last equation with Eq. (8), we see that 
a closed-loop controller is essentially an open-loop 
controller acting on the basis of $p(x|c)$ instead of $p_X(x)$ [34]. 
Thus, given that $c$ is fixed, a closed-loop equivalent of 
Eq. (12) can be calculated simply by substituting $p_X(x)$ 
with $p(x|c)$, thereby obtaining 

$$\Delta H^c_{\text{closed}} = H(X|c) - H(X'|c) \tag{18}$$

for all $c$. 
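
The Bayesian inversion of Eqs. (15)-(17) can be sketched as follows, assuming every control value has nonzero probability $p_C(c) > 0$. The function name, the array layout, and the example (a sensor with a hypothetical 10% error rate driving a controlled-NOT actuation) are illustrative choices.

```python
from math import log2

def entropy(p):
    return -sum(q * log2(q) for q in p if q > 0)

def closed_loop(p_x, sensor, actuation):
    """Return p_C(c), p(x|c), p(x'|c) and p_X'(x')_closed.

    sensor[x][c] = p(c|x); actuation[c][x][x'] = p(x'|x,c).
    Assumes p_C(c) > 0 for every c.
    """
    n_x, n_c = len(p_x), len(actuation)
    p_c = [sum(sensor[x][c] * p_x[x] for x in range(n_x))
           for c in range(n_c)]                                   # Eq. (16)
    post = [[sensor[x][c] * p_x[x] / p_c[c] for x in range(n_x)]
            for c in range(n_c)]                                  # Eq. (15)
    out_c = [[sum(actuation[c][x][xp] * post[c][x] for x in range(n_x))
              for xp in range(n_x)] for c in range(n_c)]
    p_out = [sum(p_c[c] * out_c[c][xp] for c in range(n_c))
             for xp in range(n_x)]                                # Eq. (17)
    return p_c, post, out_c, p_out

p_x = [0.5, 0.5]
sensor = [[0.9, 0.1], [0.1, 0.9]]                       # noisy copy of X
actuation = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]        # x' = x XOR c
p_c, post, out_c, p_out = closed_loop(p_x, sensor, actuation)
dH_c_closed = [entropy(post[c]) - entropy(out_c[c]) for c in range(2)]  # Eq. (18)
```

Each conditional reduction of Eq. (18) vanishes here because both actuations are permutations, yet the final state is concentrated on 0 with probability 0.9, so $H(X')_{\text{closed}} < H(X)$.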

The rationale for decomposing a closed-loop control 
action into a set of conditional actuations can be justified by 
observing that a closed-loop controller, after the 
estimation step, can be thought of as an ensemble of open-loop 
controllers acting on a set of estimated states. In other 
words, what differentiates open-loop and closed-loop 
control from the viewpoint of the actuator is the fact that, 
for the former strategy, a given control action selected 
by $C = c$ transforms all the values $x$ contained in the 
support of $X$, i.e., the set 

$$\operatorname{supp}(X) = \{x \in \mathcal{X} : p_X(x) > 0\}, \tag{19}$$

whereas for the latter strategy, namely closed-loop 
control, the same actuation only affects the support of the 
posterior distribution $p(x|c)$ associated with $X|c$, the 
random variable $X$ conditioned on the outcome $c$. This is 
so because the decision as to which control value is used 
has been determined according to the observation of 
specific values of $X$ which are in turn affected by the chosen 
control value. By combining the influence of all the 
control values, we thus have that information gathered by 
the sensor affects the entire control process by inducing 
a covering of the support space 

$$\operatorname{supp}(X) = \bigcup_{c \in \mathcal{C}} \operatorname{supp}(X|c), \tag{20}$$

in such a way that values $x \in \operatorname{supp}(X|c_1)$, for a fixed 
$c_1 \in \mathcal{C}$, are controlled by the corresponding actuation 
channel $p(x'|x, C = c_1)$, while other values in $\operatorname{supp}(X|c_2)$ 
are controlled using $p(x'|x, C = c_2)$, and so on for all 
$c_i \in \mathcal{C}$. This is manifest if one compares Eqs. (8) and (17). 
Note that a particular value $x$ included in $\operatorname{supp}(X)$ may 
be actuated by many different control values if it is part 
of more than one 'conditional' support $\operatorname{supp}(X|c)$. Hence 
the fact that Eq. (20) only specifies a covering, and not 
necessarily a partition constructed from non-overlapping 
sets. Whenever this occurs, we say that the control is 
mixing. 
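
The covering of Eq. (20) and the notion of mixing can be made concrete with sets. In this sketch, `conditional_supports` and `is_mixing` are hypothetical helper names, and the two sensors are invented three-state examples; a value $x$ belongs to $\operatorname{supp}(X|c)$ exactly when $p(c|x)\,p_X(x) > 0$.

```python
def conditional_supports(p_x, sensor):
    """supp(X|c) = {x : p(c|x) p_X(x) > 0}, one set per control value c."""
    n_c = len(sensor[0])
    return [{x for x, px in enumerate(p_x) if px * sensor[x][c] > 0}
            for c in range(n_c)]

def is_mixing(supports):
    """True when at least two conditional supports overlap."""
    return any(supports[i] & supports[j]
               for i in range(len(supports))
               for j in range(i + 1, len(supports)))

p_x = [0.4, 0.4, 0.2]
sharp_sensor = [[1, 0], [1, 0], [0, 1]]        # deterministic: a partition
noisy_sensor = [[0.9, 0.1], [1, 0], [0, 1]]    # x = 0 may trigger either c

sharp = conditional_supports(p_x, sharp_sensor)
noisy = conditional_supports(p_x, noisy_sensor)
```

In both cases the union of the conditional supports recovers $\operatorname{supp}(X)$, but only the noisy sensor produces overlapping sets, so only that control is mixing.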

To illustrate the above ideas about subdynamics 
applied to conditional subsets of $\mathcal{X}$ in a more concrete 
setting, we proceed in the next paragraph with a basic 
example involving the control of a binary state system using 
a controller restricted to use permutations as actuation 
rules [25]. This example will be used throughout the 
article as a test situation for other concepts. 

Example 1. Let $C$ be a binary state controller acting 
on a bit $X$ by means of a so-called controlled-NOT (CNOT) 
logical gate. As shown in the circuits of Figures 3a-b, the 
state $X$, under the action of the gate, is left intact or is 
negated depending on the control value: 

$$x' = x \oplus c = \begin{cases} x, & c = 0, \\ \neg x, & c = 1. \end{cases} \tag{21}$$

($\oplus$ stands for modulo 2 addition.) Furthermore, assume 
that the controller's state is determined by the outcome 
of a 'perfect' sensor which can be modeled by another 
CNOT gate such that $C = X$ when $C$ is initially set to 0 
(Figure 3c). As a result of these actuation rules, it can be 
verified that $\Delta H^c_{\text{open}} = \Delta H^c_{\text{closed}} = 0$, and so the 
application of a single open- or closed-loop control action cannot 
increase the uncertainty $H(X)$. In fact, whether the 
subdynamics is applied in an open- or closed-loop fashion is 
irrelevant here: a permutation is just a permutation in 
either case. Now, since $C = X$, we have that the 
random variable $X$ conditioned on $C = c$ must be equal to 
$c$ with probability one. For closed-loop control, this 
implies that the value $X = 0$, which is the only element 
of $\operatorname{supp}(X|C = 0)$, is kept constant during actuation, 
whereas the value $X = 1$ in $\operatorname{supp}(X|C = 1)$ is negated to 
0 in accordance with the controller's state $C = 1$ (Figure 
3e). Under this control action, the conditional random 
variable $X'|c$ is forced to assume the same deterministic 
value for all $c$, implying that $X'$ must be deterministic 
as well, regardless of the statistics of $C$ (Figures 3f-g). 
Therefore, $H(X')_{\text{closed}} = 0$. In contrast, the 
application of the same actuation rules in an open-loop fashion 
transforms the state $X$ to a final state having, at best, 
no less uncertainty than what is initially specified by the 
statistics of $X$, i.e., $H(X')_{\text{open}} \geq H(X)$. ■ 
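
The conclusions of Example 1 can be verified by direct enumeration. The sketch below assumes a biased input bit with $P(X = 1) = 0.3$ (an arbitrary choice) and compares the perfect-sensor closed loop $C = X$ with open-loop controls whose bias $q = P(C = 1)$ does not depend on $X$.

```python
from math import log2

def entropy(p):
    return -sum(q * log2(q) for q in p if q > 0)

def cnot_output(p_x1, p_c1_given_x):
    """P(X' = 1) for X' = X xor C, with p_c1_given_x[x] = P(C = 1 | X = x)."""
    p1 = 0.0
    for x, px in enumerate([1 - p_x1, p_x1]):
        for c, pc in enumerate([1 - p_c1_given_x[x], p_c1_given_x[x]]):
            if x ^ c == 1:
                p1 += px * pc
    return p1

p_x1 = 0.3                                   # assumed bias of the input bit
h_x = entropy([1 - p_x1, p_x1])

# Closed loop with a perfect sensor: C = X, so X' = X xor X = 0.
p1_closed = cnot_output(p_x1, [0.0, 1.0])
h_closed = entropy([1 - p1_closed, p1_closed])

# Open loop: C is drawn independently of X with P(C = 1) = q.
open_biases = (0.0, 0.25, 0.5, 0.75, 1.0)
h_open = [entropy([1 - p, p])
          for p in (cnot_output(p_x1, [q, q]) for q in open_biases)]
```

The closed-loop output entropy is zero, whereas every open-loop bias tested leaves at least $H(X)$ bits of uncertainty in $X'$, with equality at $q = 0$ and $q = 1$.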



IV. ENTROPIC FORMULATION OF 
CONTROLLABILITY AND OBSERVABILITY 

The first instance of the general control problem that 
we now proceed to study involves the dual concepts of 
controllability and observability. In control theory, the 
importance of these concepts arises from the fact that 
they characterize mathematically the input-output struc- 
ture of a system intended to be controlled, and thereby 
determine whether a given control task is realizable or 
not [2, 3]. In short, controllability is concerned with the 
possibilities and limitations of the actuation channel or, 
in other words, the class of control dynamics that can be 
effected by a controller. Observability, on the other hand, 
is concerned with the set of states which are accessible to 
estimation given that a particular sensor channel is used. 
In this section, prompted by preliminary results obtained 
by Lloyd and Slotine [19], we define entropic analogs of 
the widely held control-theoretic definitions of controlla- 
bility and observability, and explore the consequences of 
these new definitions. 
FIG. 3: Controlled-NOT controller. (a) Boolean circuit illustrating the effect of the controller's state C = 0 on the input states 
0, 1 of the controlled system X (identity in this case). (b) Control action triggered by C = 1 (swapping). (c) Complete control 
system with sensor S and actuator A. Note that the sensor itself is modeled by a CNOT gate. (d)-(g) State of the controlled 
system at different stages of the control depicted in the spirit of conditional analysis. (d) A uniformly distributed input state 
X is measured by the sensor in such a way that the conditional random variable X|c is deterministic (e). (f) The control action 
triggered by C has the effect of swapping the values x for which C = 1. (g) Deterministic probability distribution for the final 
state X' upon averaging over C. 



A. Controllability 

In its simplest expression, a system is said to be 
controllable at $X = x$ if any final state $X' = x'$ can 
be reached from $X = x$ using at least one control input 
$C = c$ [2, 3]. Allowing for non-deterministic control 
actions, we may refine this definition and say that a system 
is perfectly controllable at $X = x$ if it is controllable at 
$X = x$ with probability 1, i.e., if, for any $x'$, there exists 
at least one $c$ such that $p(x'|x, c) = 1$. In other words, 
a system is perfectly controllable if (i) all final states for 
$X'$ are reachable from $X = x$ (complete reachability 
condition); and (ii) all final states for $X'$ are connected to 
$X = x$ by at least one deterministic subdynamics 
(deterministic transitions condition). In terms of entropy, these 
two conditions are translated as follows. (The next 
result was originally put forward in [19] without a complete 
proof.) 

Theorem 1. A system is perfectly controllable at $x \in \mathcal{X}$ 
if and only if there exists a distribution $p(c|x)$ [35] such 
that 

$$p(x'|x) = \sum_{c \in \mathcal{C}} p(x'|x, c)\, p(c|x) \neq 0 \tag{22}$$

for all $x'$, and 

$$H(X'|x, C) = \sum_{c \in \mathcal{C}} H(X'|x, c)\, p(c|x) = 0, \tag{23}$$

where 

$$H(X'|x, c) = -\sum_{x' \in \mathcal{X}} p(x'|x, c) \log p(x'|x, c). \tag{24}$$

Proof. If the system is perfectly controllable at $x$, then for each $x'$ there exists 
at least one control value $c = c(x', x) \in \mathcal{C}$ such that 
$p(x'|x, c) = 1$, and thus $H(X'|x, c) = 0$. Also, choosing 

$$\operatorname{supp}(C|x) = \{c : p(x'|x, c) = 1\} \tag{25}$$

over all $x' \in \mathcal{X}$ and $X = x$ ensures that the average 
conditional entropy over the conditional random variable 
$C|x$ vanishes, and that $p(x'|x) \neq 0$. This proves the 
direct part of the theorem. To prove the converse, note 
that if $p(x'|x) \neq 0$ for a given $x'$, then there is at least 
one value $c$ for which $p(x'|x, c) \neq 0$, which means that 
there is at least one subdynamics connecting $x$ to $x'$. If in 
addition we have $H(X'|x, C) = 0$, then we can conclude 
that such a subdynamics must in fact be deterministic. 
As this is verified for any state value $x'$, we obtain in 
conclusion that for all $x' \in \mathcal{X}$ there exists a $c$ such that 
$p(x'|x, c) = 1$. ■ 
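
The two conditions of Theorem 1 can be tested on finite examples. In the sketch below, `deterministic_controls` searches for one control value per target state with $p(x'|x,c) = 1$ (an exact comparison, so the matrices are assumed to contain exact zeros and ones where deterministic), and `control_entropy` evaluates Eq. (23) for a given $p(c|x)$. Both names and both example actuation sets are assumptions.

```python
from math import log2

def deterministic_controls(actuation, x):
    """Map each x' to one c with p(x'|x,c) = 1, or None if some x' has none."""
    chosen = {}
    for xp in range(len(actuation[0])):
        hits = [c for c, ch in enumerate(actuation) if ch[x][xp] == 1]
        if not hits:
            return None
        chosen[xp] = hits[0]
    return chosen

def control_entropy(actuation, x, p_c_given_x):
    """H(X'|x,C) = sum_c H(X'|x,c) p(c|x), Eqs. (23)-(24)."""
    total = 0.0
    for c, pc in enumerate(p_c_given_x):
        h = -sum(q * log2(q) for q in actuation[c][x] if q > 0)
        total += pc * h
    return total

cnot = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]                  # x' = x xor c
noisy = [[[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.9, 0.1]]]  # 10% actuation error

controls = deterministic_controls(cnot, 0)
loss_cnot = control_entropy(cnot, 0, [0.5, 0.5])
loss_noisy = control_entropy(noisy, 0, [0.5, 0.5])
```

For the controlled-NOT actuation both target states are reached deterministically from $x = 0$ and the conditional entropy vanishes. With noisy actuation no deterministic control exists and $H(X'|x, C) > 0$, so the system is only approximately controllable at that state.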
In the case where a system is only approximately 
controllable, i.e., controllable but not in a deterministic 
fashion, the conditional entropy $H(X'|x, C)$ has the desirable 
feature of being interpretable as the residual uncertainty 
or uncontrolled variation left in the output $X'$ when the 
controller's state $C$ is chosen with respect to the initial 
value $x$ [19]. If one regards $C$ as an input to a 
communication channel and $X'$ as the channel output, then the 
degree to which the final state $X'$ is controlled by 
manipulating the controller's state can be identified with the 
conditional mutual information $I(X';C|x)$. This latter 
quantity can be expressed either using a formula similar 
to Eq. (4), or by using the expression 

$$I(X';C|x) = H(X'|x) - H(X'|x, C), \tag{26}$$
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which is a conditional version of the chain rule 
I{X;Y) = H{X) - H{X\Y), 



(27) 



However, H{X'\X, C, Z) = 0, since the knowledge of the 
triplet {x, c, z) is sufficient to infer the value of X' (see 
the conditions in Section H). Hence, 



valid for any random variables X and Y. 

Note that the two above equations allow for another 
interpretation of H{X'\x,C). The conditional entropy 
H{X\Y), entering in (27 ), is often interpreted in com- 
munication theory as representing an information loss 
(the so-called equivocation of Shannon [26]), which re- 
sults from substracting the maximum noiseless capacity 
I{X; X) = H{X) of a communication channel with input 
X and output Y from the actual capacity of that chan- 
nel as measured by I(X;Y). In our case, we can apply 
the same reasoning to Eq.(26), and interpret the quantity 
H{X'\x, C) as a control loss which appears as a negative 
contribution in the expression of I{X'\ C\x), the number 
of bits of accuracy to which specifying the control vari- 
able specifies the output state of the controlled system. 
This means that higher is the quantity H[X'\x, C), then 
higher is the uncertainty or imprecision associated with 
the outcome of X' upon application of the control action. 



B. Complete and average controllability 

In order to characterize the complete controllability of 
a system, i.e., its controllability properties over all possi- 
ble initial states, define 

Lc = mm mX'\X,C) 

{p{c\x)} 

= min Vpx(a;)^if(X'|x,c)Kck) (28) 

as the average control loss. (The minimization over all 
conditional distributions for C is there to ensure that Lc 
reflects the properties of the actuation channel, and docs 
not depend on one's choice of control inputs.) With this 
definition, we have that a system is perfectly controllable 
over the support of X if Lc — and p{x'\x) ^ for all 
x' . In any other cases, it is approximately controllable 
for at least one a;. The proof of this result follows es- 
sentially by noting that, since discrete entropy is positive 
definite, the condition H{X'\X,C) = necessarily im- 
plies h[x'\x,C) = for ah x e supp(X). 

The next two results relate the average control loss 
with other quantities of interest. Control graphs contain- 
ing the purification of the actuation chaimel, as depicted 
in Figure 2, are used throughout the rest of this section. 

Theorem 2. Under the assumption that $X'$ is a deterministic random
variable conditioned on the values $x$, $c$, and $z$ (purification
assumption), we have $L_C \le H(Z)$ with equality if, and only if,
$H(Z|X',X,C) = 0$.

Proof. Using the general inequality $H(X) \le H(X,Y)$, and the chain
rule for joint entropies, one may write

$$H(X'|X,C) \le H(X',Z|X,C) = H(Z|X,C) + H(X'|X,C,Z). \tag{29}$$

The purification assumption gives $H(X'|X,C,Z) = 0$, so that

$$H(X'|X,C) \le H(Z|X,C) = H(Z), \tag{30}$$

where the last equality follows from the fact that $Z$ is chosen
independently of $X$ and $C$ as illustrated in the control graph of
Figure 2a. Now, from the chain rule

$$H(X',Z|X,C) = H(X'|X,C) + H(Z|X',X,C), \tag{31}$$

it is clear that equality in the first line of expression (29) is
achieved if and only if $H(Z|X',X,C) = 0$. ■
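
The bound of Theorem 2 can be checked numerically on a small purified channel. The sketch below computes $H(X'|X,C)$ for a fixed (not minimized) $p(c|x)$, which is enough since inequality (30) holds for any choice of control inputs. The two actuation rules, `x ^ c ^ z` and `(x ^ c) & z`, are hypothetical examples chosen to show the equality case and a strict inequality.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy (in bits) of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def output_cond_entropy(p_x, p_c_given_x, p_z, f):
    """H(X'|X,C) in bits when X' = f(x, c, z) and Z is independent of (X, C)."""
    total = 0.0
    for x, px in p_x.items():
        for c, pc in p_c_given_x[x].items():
            out = defaultdict(float)
            for z, pz in p_z.items():
                out[f(x, c, z)] += pz          # distribution of X' given (x, c)
            total += px * pc * entropy(out.values())
    return total

p_x = {0: 0.5, 1: 0.5}
p_c_given_x = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}
p_z = {0: 0.8, 1: 0.2}
h_z = entropy(p_z.values())

# Z can be recovered from (X', X, C): equality case of Theorem 2.
h_xor = output_cond_entropy(p_x, p_c_given_x, p_z, lambda x, c, z: x ^ c ^ z)
# Z is hidden whenever x XOR c = 0: the bound is not saturated.
h_and = output_cond_entropy(p_x, p_c_given_x, p_z, lambda x, c, z: (x ^ c) & z)
```

In the first rule the noise is fully imprinted on $X'$, so $H(Z|X',X,C) = 0$ and the bound is tight; in the second rule half of the $(x,c)$ pairs mask the noise and the conditional entropy drops to $H(Z)/2$.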

The result of Theorem 2 demonstrates that the uncer- 
tainty associated with the control of the state X is upper 
bounded by the noise level of the actuation channel as 
measured by the entropy of Z. This agrees well with the 
fact that one goal of controllers is to protect a system 
against the effects of its environment so as to ensure that 
it is minimally affected by noise. In the limit where the 
control loss vanishes, the state X' of the controlled sys- 
tem should show no variability given that we know the 
initial state and the control action, even in the presence 
of actuation noise, and should thus be independent of the 
random variable $Z$. This is the essence of the next two results which
hold for the same conditions as Theorem 2 (the minimization over the
set of conditional probability distributions $\{p(c|x)\}$ is implied
at this point).

Theorem 3. $L_C = I(X';Z|X,C)$.

Proof. From the chain rule of mutual information, we can easily derive

$$I(X';Z|X,C) = H(X'|X,C) - H(X'|X,C,Z). \tag{32}$$

Thus, $I(X';Z|X,C) = H(X'|X,C)$ if we use again the deterministic
property of the random variable $X'|x,c,z$ upon purification of
$p(x'|x,c)$. ■

Theorem 4. $L_C = I(X';X,C,Z) - I(X';X,C)$.

Proof. Using the chain rule of mutual information, we write

$$\begin{aligned} I(X';X,C,Z) &= H(X') - H(X'|X,C,Z) \\ &= H(X') - H(X'|X,C,Z) + H(X'|X,C) - H(X'|X,C) \\ &= I(X';X,C) + I(X';Z|X,C). \end{aligned} \tag{33}$$

For the last equality, we have used Eq.(32). Now, by substituting
$L_C = I(X';Z|X,C)$ from the previous theorem, we obtain the desired
result. ■
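
The identities of Theorems 3 and 4 can be verified by brute force on a small joint distribution. The sketch below is a minimal check with $p(c|x)$ held fixed rather than minimized; the actuation rule `(x ^ c) | z` and all numerical values are hypothetical choices made only for illustration.

```python
import math
from collections import defaultdict
from itertools import product

def H(joint, idx):
    """Entropy (bits) of the marginal of `joint` over the coordinates in idx."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def cond_mi(joint, a, b, given=()):
    """I(A;B|G) = H(A,G) + H(B,G) - H(A,B,G) - H(G)."""
    return (H(joint, a + given) + H(joint, b + given)
            - H(joint, a + b + given) - H(joint, given))

# Coordinates of each outcome: (x, c, z, x').
X, C, Z, XP = (0,), (1,), (2,), (3,)
p_x = {0: 0.3, 1: 0.7}
p_c_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
p_z = {0: 0.8, 1: 0.2}

joint = defaultdict(float)
for x, c, z in product((0, 1), repeat=3):
    x_out = (x ^ c) | z          # hypothetical purified actuation rule
    joint[(x, c, z, x_out)] += p_x[x] * p_c_given_x[x][c] * p_z[z]

loss = H(joint, XP + X + C) - H(joint, X + C)                      # H(X'|X,C)
thm3 = cond_mi(joint, XP, Z, X + C)                                # I(X';Z|X,C)
thm4 = cond_mi(joint, XP, X + C + Z) - cond_mi(joint, XP, X + C)   # Theorem 4
```

All three quantities coincide up to rounding, as expected from the purification property $H(X'|X,C,Z) = 0$.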
As a direct corollary of these two results, we have that a system is
completely and perfectly controllable if, and only if, $I(X';Z|X,C)$
is equal to zero or equivalently if, and only if,

$$I(X';X,C,Z) = I(X';X,C). \tag{34}$$

Hence, a necessary and sufficient entropic condition for perfect
controllability is that the final state of the controlled system,
after the actuation step, is statistically independent of the noise
variable $Z$ given $X$ and $C$. In that case, the 'information'
$I(X';Z|X,C)$ conveyed in the form of noise from $Z$ to the controlled
system is zero. Another 'common sense' interpretation of this result
can be given if the quantity $I(X';Z|X,C)$ is instead viewed as
representing the 'information' about $X'$ that has been transferred to
the non-controllable state $Z$ in the form of 'lost' correlations.

This analysis of control systems in terms of noise 
and information protection is similar to that of error- 
correcting codes. The design of error-correcting codes is 
closely related to that of control systems: the information 
duplicated by a code, when corrupted by noise, is used 
to detect errors (sensor step) which are then corrected by 
enacting specific correcting or erasure actions (actuation 
step) [26, 36, 37]. The analogy to error-correcting codes 
can be strengthened even further if probabilities account- 
ing for undetected and uncorrected errors are modeled by 
means of communication channels similar to the sensor 
and actuation channels. In this context, whether or not a 
prescribed set of erasure actions is sufficient to correct for 
a particular type of errors is determined by the control 
loss. 



C. Observability 

The concept of observability is concerned with the issue of inferring
the state $X$ of the controlled system based on some knowledge or data
of the state provided by a measurement apparatus, taken here to
correspond to $C$. More precisely, a controlled system is termed
perfectly observable if the sensor's transition matrix $p(c|x)$ maps
no two values of $X$ to a single observational output value $c$, or in
other words if for all $c \in \mathcal{C}$ there exists only one value
$x$ such that $p(x|c) = 1$. As a consequence, we have the following
result [19]. (We omit the proof which readily follows from well-known
properties of entropy.)

Theorem 5. A system with state variable $X$ is perfectly observable,
with respect to all observed values $c \in \mathrm{supp}(C)$, if and
only if

$$H(X|C) = \sum_{c \in \mathcal{C}} H(X|c)\, p_C(c) = 0. \tag{35}$$
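
The criterion of Theorem 5 is easy to evaluate for finite sensor channels. The following sketch computes $H(X|C)$ for three hypothetical sensors acting on a three-state system: an identity sensor, a many-to-one sensor, and a non-deterministic sensor that never maps two states to the same reading. The channel definitions are assumptions made for the sake of the example.

```python
import math
from collections import defaultdict

def sensor_loss(p_x, p_c_given_x):
    """Sensor loss H(X|C) in bits for input p_X(x) and sensor channel p(c|x)."""
    joint = {}
    p_c = defaultdict(float)
    for x, px in p_x.items():
        for c, pc in p_c_given_x[x].items():
            if px * pc > 0:
                joint[(x, c)] = px * pc
                p_c[c] += px * pc
    # H(X|C) = -sum_{x,c} p(x,c) log2 p(x|c), with p(x|c) = p(x,c) / p(c)
    return -sum(p * math.log2(p / p_c[c]) for (x, c), p in joint.items())

p_x = {0: 1/3, 1: 1/3, 2: 1/3}
identity = {0: {0: 1.0}, 1: {1: 1.0}, 2: {2: 1.0}}                   # c = x
parity = {0: {0: 1.0}, 1: {1: 1.0}, 2: {0: 1.0}}                     # c = x mod 2
noisy_lossless = {0: {"a": 0.5, "b": 0.5}, 1: {"c": 1.0}, 2: {"d": 1.0}}

loss_identity = sensor_loss(p_x, identity)
loss_parity = sensor_loss(p_x, parity)
loss_noisy = sensor_loss(p_x, noisy_lossless)
```

Only the parity sensor has a non-zero loss (states 0 and 2 share a reading); the third sensor is random yet lossless, since every reading still identifies a single state.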

The information-theoretic analog of a perfectly observable system is a
lossless communication channel $X \to Y$ characterized by
$H(X|Y) = 0$ for all input distributions [32]. As a consequence of
this association, we interpret the conditional entropy $H(X|C)$ as the
information loss, or sensor loss, of the sensor channel, denoted by
$L_S$. We now extend our results on controllability into the domain of
observability. The first question that arises is, given the similarity
between the average control loss $L_C$ and the sensor loss, do we
obtain true results for observability by merely replacing $L_C$ by
$L_S$ in Theorems 2 and 3?

The answer is no: the fact that a communication channel is lossless
has nothing to do with the fact that it can be non-deterministic. An
example of such a channel is one that maps the singleton input set
$\mathcal{X} = \{0\}$ to multiple values of the output set
$\mathcal{C}$ with equal probabilities. This is clearly a
non-deterministic channel, and yet since there is only one possible
value for $X$, the conditional entropy $H(X|c)$ must be equal to zero
for all $c \in \mathcal{C}$. Hence, contrary to Theorem 2, the
observation loss $L_S$ cannot be bounded above by the entropy of the
random variable responsible for the non-deterministic properties of
the sensor channel. However, we are not far from a similar result: by
analyzing the meaning of the sensor loss a bit further, the
generalization of Theorem 2 for observability can in fact be derived
using the 'backward' version of the sensor channel. More precisely,
$L_S \le H(Z_b)$ where $Z_b$ is now the random variable associated
with the purification of the transition matrix $p(x|c)$. To prove this
result, the reader may revisit the proof of Theorem 2, and replace the
forward purification condition $H(C|X,Z) = 0$ for the sensor channel
by its backward analog $H(X|C,Z_b) = 0$.

To close this section, we present what remains of the generalization
of the results on controllability to observability. One example aimed
at illustrating the interplay between the controllability and
observability properties of a system is also given.

Theorem 6. If the state $X$ is perfectly observable, then
$I(X;Z|C) = 0$. (The random variable $Z$ stands for the purification
variable of the 'forward' sensor channel $p(c|x)$.)

Proof. The proof is rather straightforward. Since
$H(X|C) \ge H(X|C,Z)$, the condition $L_S = 0$ implies
$H(X|C,Z) = 0$. Thus by the chain rule

$$I(X;Z|C) = H(X|C) - H(X|C,Z), \tag{36}$$

we conclude with $I(X;Z|C) = 0$. ■

Corollary 7. If $L_S = 0$, then $I(X;C,Z) = I(X;C)$.

The interpretations of the two above results follow closely those
given for controllability. We will not discuss these results further
except to mention that, contrary to the case of controllability,
$I(X;Z|C) = 0$ is not a sufficient condition for a system to be
observable. This follows simply from the fact that $I(X;Z|C) = 0$
implies $H(X|C) = H(X|C,Z)$, and at this point the purification
condition $H(C|X,Z) = 0$ for the sensor channel is of no help to
obtain $H(X|C) = 0$.
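
A concrete counterexample may help to see why $I(X;Z|C) = 0$ does not guarantee observability. The sketch below evaluates $H(X|C)$, $I(X;Z|C)$ and the purification condition $H(C|X,Z)$ for three hypothetical purified sensors $c = g(x,z)$: a perfect sensor, a 'blind' sensor that outputs only the noise variable, and a binary symmetric sensor. These sensors are illustrative assumptions, not taken from the text.

```python
import math
from collections import defaultdict
from itertools import product

def H(joint, idx):
    """Entropy (bits) of the marginal of `joint` over the coordinates in idx."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def sensor_quantities(g, p_x, p_z):
    """Return (H(X|C), I(X;Z|C), H(C|X,Z)) for the purified sensor c = g(x, z)."""
    joint = defaultdict(float)           # outcomes are (x, z, c)
    for (x, px), (z, pz) in product(p_x.items(), p_z.items()):
        joint[(x, z, g(x, z))] += px * pz
    h_x_given_c = H(joint, (0, 2)) - H(joint, (2,))
    h_x_given_cz = H(joint, (0, 1, 2)) - H(joint, (1, 2))
    h_c_given_xz = H(joint, (0, 1, 2)) - H(joint, (0, 1))
    return h_x_given_c, h_x_given_c - h_x_given_cz, h_c_given_xz

p_x = {0: 0.5, 1: 0.5}
p_z = {0: 0.8, 1: 0.2}
perfect = sensor_quantities(lambda x, z: x, p_x, p_z)      # c = x
blind = sensor_quantities(lambda x, z: z, p_x, p_z)        # c ignores x
noisy = sensor_quantities(lambda x, z: x ^ z, p_x, p_z)    # binary symmetric
```

The blind sensor satisfies both $I(X;Z|C) = 0$ and the purification condition, yet its sensor loss equals the full entropy $H(X) = 1$ bit, so the converse of Theorem 6 fails for it.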

Example 2. Consider again the control system of Figure 3. Given the
actuation rules described by the CNOT logical gate, it can be verified
easily that for $x = 0$ or 1, $H(X'|x,C) = 0$ and $p(x'|x) \neq 0$ for
all $x'$. Therefore, the controlled system is completely and perfectly
controllable. This implies, in particular, that
$\Delta H^c_{open} = \Delta H^c_{closed} = 0$, and that the final
state of the controlled system may be actuated to a single value with
probability 1, as noted before. For the latter observation, note that
$X' = x'$ with probability 1 so long as the initial state $X$ is known
with probability 1 (perfectly observable). In general, if a system is
perfectly controllable (actuation property) and perfectly observable
(sensor property), then it is possible to perfectly control its state
to any desired value with vanishing probability of error. In such a
case, we can say that the system is closed-loop controllable. ■

D. The case of continuous random variables

The concept of a deterministic continuous random variable is somewhat
ill-defined, and, in any case, cannot be associated with the condition
$H(X) = 0$ formally. (Consider, e.g., the peaked distribution
$p(x) = \delta(x - x_0)$ which is such that $H(X) = -\infty$.) To
circumvent this difficulty, controllability and observability for
continuous random variables may be extended via a quantization or
coarse-graining of the relevant state spaces [32]. For example, a
continuous-state system can be defined to be perfectly controllable at
$x$ if for every final destination $x'$ there exists at least one
control value $c$ which forces the system to reach a small
neighborhood of radius $\Delta > 0$ around $x'$ with probability 1.
Equivalently, $x$ can be termed perfectly controllable to accuracy
$\Delta$ if the variable $X^\Delta$ obtained by quantizing $X$ at a
scale $\Delta$ is perfectly controllable. Similar definitions
involving quantized random variables can also be given for
observability. The recourse to the quantized description of continuous
variables has the virtue that $H(X^\Delta)$ and
$H(X'^\Delta|C^\Delta)$ are well-defined functions which cannot be
infinite. It is also the natural representation used for
continuous-state models on computers.
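
The effect of quantization can be illustrated with a few lines of code. The sketch below bins a set of real-valued samples on a grid of cell size $\Delta$ and returns the empirical entropy of the resulting discrete variable; the sample sets are deterministic stand-ins chosen for reproducibility and are assumptions of this example.

```python
import math
from collections import Counter

def quantized_entropy(samples, delta):
    """Empirical entropy (bits) of samples quantized on a grid of cell size delta."""
    cells = Counter(math.floor(s / delta) for s in samples)
    n = len(samples)
    return -sum((k / n) * math.log2(k / n) for k in cells.values())

# Deterministic stand-in for a uniform density on [0, 1): 1000 evenly spaced points.
spread = [(i + 0.5) / 1000 for i in range(1000)]
# A sharply peaked state: every point lies within a single grid cell.
peaked = [0.3 + 1e-6 * i for i in range(1000)]

h_spread = quantized_entropy(spread, 0.125)
h_peaked = quantized_entropy(peaked, 0.125)
```

At scale $\Delta = 0.125$ the spread state occupies eight cells equally and carries 3 bits, while the peaked state is deterministic to accuracy $\Delta$ and has zero quantized entropy, even though its differential entropy would be very negative.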

V. STABILITY AND ENTROPY REDUCTION 

The emphasis in the previous section was on proving upper limits for
the control and the observation loss, and on finding conditions for
which these losses vanish. In this section, we depart from these
quantities to focus our attention on other measures which are
interesting in view of the stability properties of a controlled
system. How can a system be stabilized to a target state or a target
subset (attractor) of states? Also, how much information does a
controller need to gather in order to successfully achieve a
stabilization procedure? To answer these questions, we first propose
an entropic criterion of stability, and justify its usefulness for
problems of control. In a second step, we investigate the quantitative
relationship between the closed-loop mutual information $I(X;C)$ and
the gain in stability which results from using information in a
control process.

A. Stochastic stability

Intuitively, a stable system is a system which, when activated in the
proximity of a desired operating point, stays relatively close to that
point indefinitely in time, even in the presence of small
perturbations. In the field of control engineering, there exist
several formalizations of this intuition, some less stringent than
others, whose range of applications depends on theoretical as well as
practical considerations. It would be impossible, and, perhaps
inappropriate, to review here all the definitions of stability
currently used in the study and design of control systems; for our
purposes, it suffices to say that a necessary condition for
stabilizing a dynamical system is to be able to decrease its entropy,
or immunize it from sources of entropy like those associated with
environment noise, motion instabilities, and incomplete specification
of control conditions. This entropic aspect of stabilization is
implicit in almost all criteria of stability insofar as a
probabilistic description of systems focusing on sets of responses,
rather than on individual responses one at a time, is adopted [38-41].
In this sense, what is usually sought in controlling a system is to
confine its possible states, trajectories or responses within a set as
small as possible (low entropy final state) starting from a wide range
of initial states or initial conditions (high entropy initial random
state).

The fundamental role of entropy reduction in control suggests the two
following problems. First, given the initial state $X$ and its entropy
$H(X)$, a set of actuation subdynamics, and the type of controller
(open- or closed-loop), what is the maximum entropy reduction
achievable during the controlled transition from $X$ to $X'$? Second,
what is the quantitative relationship between the maximal open-loop
entropy reduction and the closed-loop entropy reduction? Note that for
control purposes it does not suffice to reduce the entropy of $X'$
conditionally on the state of another system (the controller in
particular). For instance, the fact that $H(X'|C)$ vanishes for a
given controller acting on a system does not imply by itself that
$H(X')$ must vanish as well, or that $X'$ is stabilized. What is
required for control is that actuators modify the dynamics of the
system intended to be controlled by acting directly on it, so as to
reduce the marginal entropy $H(X')$. This unconditional aspect of
stability has been discussed in more detail in [25, 42].

B. Open-loop control optimality 

Using the concavity property of entropy, and the fact that
$\Delta H_{open}$ is upper bounded by the maximum of
$\Delta H^c_{open}$ over all control values $c$, we show in this
section that the maximum decrease of entropy achieved by a particular
subdynamics of control variable

$$\hat{c} = \arg\max_{c \in \mathcal{C}} \Delta H^c_{open} \tag{37}$$

is open-loop optimal in the sense that no random (i.e.,
non-deterministic) choice of the controller's state can improve upon
that decrease. More precisely, we have the following results.
(Theorem 9 was originally stated without a proof in [25].)

Lemma 8. For any initial state $X$, the open-loop entropy reduction
$\Delta H_{open}$ satisfies

$$\Delta H_{open} \le \Delta H^C_{open}, \tag{38}$$

where

$$\Delta H^C_{open} = \sum_{c \in \mathcal{C}} p_C(c)\, \Delta H^c_{open} = H(X) - H(X'|C)_{open}, \tag{39}$$

with $\Delta H^c_{open}$ defined as in Eq.(12). The equality is
achieved if and only if $I(X';C) = 0$.

Proof. Using the inequality $H(X') \ge H(X'|C)$, we write directly

$$\Delta H_{open} = H(X) - H(X')_{open} \le H(X) - H(X'|C)_{open} = \Delta H^C_{open}. \tag{40}$$

Now, let us prove the equality part. If $C$ is statistically
independent of $X'$, then $H(X'|C) = H(X')$, and

$$\Delta H_{open} = \Delta H^C_{open}. \tag{41}$$

Conversely, the above equality implies $H(X'|C) = H(X')$, and thus we
must have that $C$ is independent of $X'$. ■

Theorem 9. The entropy reduction achieved by a set of actuation
subdynamics used in open-loop control is always such that

$$\Delta H_{open} \le \max_{c \in \mathcal{C}} \Delta H^c_{open}, \tag{42}$$

for all $p_X(x)$. The equality can always be achieved for the
deterministic controller $C = \hat{c}$, with $\hat{c}$ defined as in
Eq.(37).

Proof. The average conditional entropy $H(X'|C)$ is always such that

$$\min_{c \in \mathcal{C}} H(X'|c) \le \sum_{c \in \mathcal{C}} p_C(c)\, H(X'|c). \tag{43}$$

Therefore, making use of the previous lemma, we obtain

$$\Delta H_{open} \le \Delta H^C_{open} \le H(X) - \min_{c \in \mathcal{C}} H(X'|c) = \max_{c \in \mathcal{C}} \Delta H^c_{open}. \tag{44}$$

Also, note that if $C = \hat{c}$ with probability 1, then the two
above inequalities are saturated since in this case $I(X';C) = 0$ and
$\Delta H^C_{open} = \Delta H^{\hat{c}}_{open}$. ■

An open-loop controller or a control strategy is called pure if the
control random variable $C$ is deterministic, i.e., if it assumes only
one value with probability 1. An open-loop controller that is not pure
is called mixed. (We also say that a mixed controller activates a
mixture of control actions.) In view of these definitions, what we
have just proved is that a pure controller with $C = \hat{c}$ is
necessarily optimal; any mixture of the control variable either
achieves the maximum entropy decrease prescribed by Eq.(42) or yields
a smaller value. As shown in the next example, this is so even if the
actuation subdynamics used in the control process are deterministic.



Example 3. For the CNOT controller of Example 1, we noted that
$H(X')_{open} = H(X)$, or equivalently that $\Delta H_{open} = 0$,
only at best. To be more precise, $\Delta H_{open} = 0$ only if a pure
controller is used or if $H(X) = 1$ bit (already at maximum entropy).
If the control is mixed, and if $H(X) < 1$ bit, then
$\Delta H_{open}$ must necessarily be negative. This is so because
uncertainty as to which actuation rule is used must imply uncertainty
as to which state the controlled system is actuated to. ■
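
The statements of Example 3 can be reproduced numerically. The sketch below assumes a binary state with $\Pr(X=1) = p$ and an open-loop control bit with $\Pr(C=1) = q$, chosen independently of $X$, acting through $X' = X \oplus C$; the parameter names `p` and `q` are introduced here for illustration only.

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def cnot_open_loop_reduction(p, q):
    """Delta H_open = H(X) - H(X') for X' = X XOR C.

    p = Pr(X = 1), q = Pr(C = 1), with C chosen independently of X.
    """
    p_out = p * (1 - q) + (1 - p) * q      # Pr(X' = 1)
    return h2(p) - h2(p_out)

pure_identity = cnot_open_loop_reduction(0.1, 0.0)    # C = 0 with probability 1
pure_flip = cnot_open_loop_reduction(0.1, 1.0)        # C = 1 with probability 1
mixed = cnot_open_loop_reduction(0.1, 0.3)            # mixed control, H(X) < 1 bit
mixed_max_entropy = cnot_open_loop_reduction(0.5, 0.3)  # H(X) = 1 bit
```

Both pure controllers leave the entropy unchanged, the mixed controller increases it (negative $\Delta H_{open}$), and mixing is harmless only when the state already has maximum entropy.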

Note that purity alone is not a sufficient condition for open-loop
optimality, nor is it a necessary one in fact. To see this, note on
the one hand that a pure controller having

$$C = \arg\min_{c \in \mathcal{C}} \Delta H^c_{open} \tag{45}$$

with probability one is surely not optimal, unless all entropy
reductions $\Delta H^c_{open}$ have the same value. On the other hand,
to prove that a mixed controller can be optimal, note that if any
subset $\mathcal{C}_0 \subseteq \mathcal{C}$ of actuation subdynamics
is such that $p(x'|c) = p_{X'}(x')$, and $\Delta H^c_{open}$ assumes a
constant value for all $c \in \mathcal{C}_0$, then one can build an
optimal controller by choosing a non-deterministic distribution $p(c)$
with $\mathrm{supp}(C) = \mathcal{C}_0$.



C. Closed-loop control optimality 

The distinguishing characteristic of an open-loop con- 
troller is that it usually fails to operate efficiently when 
faced with uncertainty and noise. An open-loop con- 
troller acting independently of the state of the controlled 
system, or solely based on the statistical information pro- 
vided by the distribution px{x), cannot reliably deter- 
mine which control subdynamics is to be applied in order 
for the initial (a priori unknown) state X to be propa- 
gated to a given target state. Furthermore, an open-loop 
control system cannot compensate actively in time for 
any disturbances that add to the actuator's driving state 
(actuation noise). To overcome these difficulties, the controller must
be adaptive: it must be capable of estimating the unpredictable
features of the controlled system during the control process, and must
be able to use the information provided by estimation to decide on
specific control actions, just as in closed-loop control.

A basic closed-loop controller was presented in Example 1. For this
example, we noted that the perfect knowledge of the initial state's
value ($X = 0$ or 1) enabled the controller to decide which actuation
subdynamics (identity or permutation) is to be used in order to
actuate the system to $X' = 0$ with probability 1. The fact that the
sensor gathers $I(X;C) = H(X)$ bits of information during estimation
is a necessary condition for this specific controller to achieve
$H(X')_{closed} = 0$, since having $I(X;C) < H(X)$ may result in
generating the value $X' = 1$ with non-vanishing probability. In
general, just as a subdynamics mapping the input states $\{0,1\}$ to
the single value $\{0\}$ would require no information to force $X'$ to
assume the value 0, we expect that the closed-loop entropy reduction
should not only depend on $I(X;C)$, the effective information
available to the controller, but should also depend on the reduction
of entropy attainable by open-loop control. The next theorem, which
constitutes the main result of this work, embodies exactly this
statement by showing that one bit of information gathered by the
controller has a maximum value of one bit in the improvement of
entropy reduction that closed-loop gives over open-loop control.

Theorem 10. The amount of entropy

$$\Delta H_{closed} = H(X) - H(X')_{closed} \tag{46}$$

that can be extracted from a system with given initial state $X$ by
using a closed-loop controller with fixed set of actuation subdynamics
satisfies

$$\Delta H_{closed} \le \Delta H^{max}_{open} + I(X;C), \tag{47}$$

where

$$\Delta H^{max}_{open} = \max_{p_X(x) \in \mathcal{P},\, c \in \mathcal{C}} \Delta H^c_{open} \tag{48}$$

is the maximum entropy decrease that can be obtained by (pure)
open-loop control over any input distribution chosen in the set
$\mathcal{P}$ of all probability distributions.

A proof of the result, based on the conservation of entropy for closed
systems, was given in [25] following results found in [43, 44]. Here,
we present an alternative proof based on conditional analysis which
has the advantage over our previous work of giving some indications
about the conditions for equality in (47). Some of these conditions
are derived in the next section.

Proof. Given that $\Delta H^{max}_{open}$ is the optimal entropy
reduction for open-loop control over any input distribution, we can
write

$$H(X')_{open} \ge H(X) - \Delta H^{max}_{open}. \tag{49}$$

Now, using the fact that a closed-loop controller is formally
equivalent to an ensemble of open-loop controllers acting on the
conditional supports $\mathrm{supp}(X|c)$ instead of
$\mathrm{supp}(X)$, we also have for all $c \in \mathcal{C}$

$$H(X'|c)_{closed} \ge H(X|c) - \Delta H^{max}_{open}, \tag{50}$$

and, on average,

$$H(X'|C)_{closed} \ge H(X|C) - \Delta H^{max}_{open}. \tag{51}$$

That $\Delta H^{max}_{open}$ must enter in the lower bounds of
$H(X')_{open}$ and $H(X')_{closed}$ can be explained in other words by
saying that each conditional distribution $p(x|c)$ is a legitimate
input distribution for the initial state of the controlled system. It
is, in any case, an element of $\mathcal{P}$. This being said, notice
now that $H(X') \ge H(X'|C)$ implies

$$H(X')_{closed} \ge H(X|C) - \Delta H^{max}_{open}. \tag{52}$$

Hence, we obtain

$$\Delta H_{closed} \le H(X) - H(X|C) + \Delta H^{max}_{open} = I(X;C) + \Delta H^{max}_{open}, \tag{53}$$

which is the desired upper bound. To close the proof, note that
$\Delta H^{max}_{open}$ cannot be evaluated using the initial
distribution $p_X(x)$ alone because the maximum reduction of entropy
in open-loop control starting from $p_X(x)$ may differ from the
reduction of entropy obtained when some actuation channel is applied
in closed-loop to $p(x|c)$. See [42] for a specific example of this. ■
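
Inequality (47) can be probed numerically on a small system. The sketch below is a sanity check under stated assumptions, not a proof: the actuation subdynamics are three hypothetical deterministic maps on four states, for which $\Delta H^{max}_{open}$ has a closed form (the logarithm of the largest preimage size), and the input distributions and sensor channels are drawn at random with a fixed seed.

```python
import math
import random
from collections import defaultdict

def entropy(probs):
    """Shannon entropy (in bits) of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def max_open_loop_reduction(maps):
    """Delta H_open^max for deterministic maps f_c given as tuples (f_c[x] = x').

    For a deterministic map, H(X) - H(f(X)) = H(X|f(X)), which is largest for X
    uniform on the biggest preimage, giving log2 of that preimage's size.
    """
    biggest = 1
    for f in maps:
        sizes = defaultdict(int)
        for image in f:
            sizes[image] += 1
        biggest = max(biggest, max(sizes.values()))
    return math.log2(biggest)

def closed_loop_terms(p_x, sensor, maps):
    """Return (Delta H_closed, I(X;C)) for sensor[x][c] = p(c|x) and X' = f_c(x)."""
    p_c = defaultdict(float)
    p_out = defaultdict(float)
    joint = []
    for x, px in enumerate(p_x):
        for c, pc in enumerate(sensor[x]):
            joint.append(px * pc)
            p_c[c] += px * pc
            p_out[maps[c][x]] += px * pc
    h_x = entropy(p_x)
    info = h_x + entropy(p_c.values()) - entropy(joint)   # I(X;C)
    return h_x - entropy(p_out.values()), info

def random_dist(n, rng):
    """Random probability vector of length n."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

maps = [(0, 1, 2, 3), (1, 0, 3, 2), (0, 0, 2, 2)]   # identity, swap, merge pairs
dh_max = max_open_loop_reduction(maps)
rng = random.Random(0)
violations = 0
for _ in range(500):
    p_x = random_dist(4, rng)
    sensor = [random_dist(len(maps), rng) for _ in range(4)]
    dh_closed, info = closed_loop_terms(p_x, sensor, maps)
    if dh_closed > dh_max + info + 1e-9:
        violations += 1
```

With these maps $\Delta H^{max}_{open} = 1$ bit, and no random trial should exceed the bound $\Delta H^{max}_{open} + I(X;C)$.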

The above theorem enables us to finally understand all the results of
Example 1. As noted already, since the actuation subdynamics consist
of permutations, we have $\Delta H^{max}_{open} = 0$ for any
distribution $p_X(x)$. Thus, we should have
$\Delta H_{closed} \le I(X;C)$. For the particular case studied where
$C = X$, the controller is found to be optimal, i.e., it achieves the
maximum possible entropy reduction $\Delta H_{closed} = I(X;C)$. This
proves, incidentally, that the bound of inequality (47) is tight. In
general, we may define a control system to be optimal in terms of
information if the gain in stability obtained by subtracting
$H(X')_{closed}$ from $H(X')^{min}_{open}$ is exactly equal to the
sensor mutual information $I(X;C)$. Equivalently, a closed-loop
control system is optimal if its efficiency $\eta$, defined by

$$\eta = \frac{H(X')^{min}_{open} - H(X')_{closed}}{I(X;C)}, \tag{54}$$

is equal to 1.

Having determined that optimal controllers do exist, we now turn to
the problem of finding general conditions under which a given
controller is found to be either optimal ($\eta = 1$) or sub-optimal
($\eta < 1$). By analyzing thoroughly the proof of Theorem 10, one
finds that the condition $I(X';C) = 0$, which was not a sufficient
condition for open-loop optimality, is again not sufficient here to
conclude that a closed-loop controller is optimal. This comes as a
result of the fact that not all control subdynamics applied in a
closed-loop fashion are such that
$\Delta H^c_{closed} = \Delta H^{max}_{open}$ in general. Therefore
the average final conditional entropy $H(X'|C)_{closed}$ need not
necessarily be equal to the bound imposed by inequality (51). However,
in a scenario where the entropy reductions $\Delta H^c_{open}$ and
$\Delta H^c_{closed}$ are both equal to a constant for all control
subdynamics, then we effectively recover an analog of the open-loop
optimality condition, namely that a zero mutual information between
the controller and the controlled system after actuation is a
necessary and sufficient condition for optimality.

Theorem 11. Under the condition that, for all $c \in \mathcal{C}$,

$$\Delta H^c_{open} = \Delta H^c_{closed} = \Delta H, \tag{55}$$

where $\Delta H$ is a constant, a closed-loop controller is optimal if
and only if $I(X';C) = 0$.

Proof. To prove the sufficiency part of the theorem, note that the
constancy condition (55) implies that the minimum for $H(X')_{open}$
equals $H(X) - \Delta H$. Similarly, closed-loop control must be such
that

$$H(X'|C)_{closed} = H(X|C) - \Delta H. \tag{56}$$

Combining these results with the fact that $I(X';C) = 0$, or
equivalently that

$$H(X')_{closed} = H(X'|C)_{closed}, \tag{57}$$

we obtain

$$H(X')^{min}_{open} - H(X')_{closed} = H(X) - H(X|C) = I(X;C). \tag{58}$$

To prove the converse, namely that optimality under condition (55)
implies $I(X';C) = 0$, notice that Eq.(56) leads to

$$H(X')^{min}_{open} - H(X'|C)_{closed} = H(X) - H(X|C) = I(X;C). \tag{59}$$

Hence, given that we have optimality, i.e., given Eq.(58), then $X'$
must effectively be independent of $C$. ■

Example 4. Consider again the now familiar CNOT controller. Let us
assume that instead of the perfect sensor channel $C = X$, we have a
binary symmetric channel such that $p(c = x|x) = 1 - e$ and
$p(c = x \oplus 1|x) = e$ where $0 \le e \le 1$, i.e., an error in the
transmission occurs with probability $e$ [32]. The mutual information
for this channel is readily calculated to be

$$I(X;C) = H(C) - \sum_{x \in \{0,1\}} p(x)\, H(C|x) = H(C) - H(e), \tag{60}$$

where

$$H(e) = -e \log e - (1 - e)\log(1 - e) \tag{61}$$

is the binary entropy function. By proceeding similarly as in
Example 1, the distribution of the final controlled state can be
calculated. The solution is $p_{X'}(0) = 1 - e$ and $p_{X'}(1) = e$,
so that $H(X') = H(e)$ and

$$\Delta H_{closed} = H(X) - H(e). \tag{62}$$

By comparing the value of $\Delta H_{closed}$ with the mutual
information $I(X;C)$ (recall that $\Delta H^{max}_{open} = 0$), we
arrive at the conclusion that the controller is optimal for $e = 0$,
$e = 1$ (perfect sensor channel), and for $H(X) = 1$ (maximum entropy
state). In going through more calculations, it can be shown that these
cases of optimality are all such that $I(X';C) = 0$. ■
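
The 'more calculations' alluded to in Example 4 can be delegated to a short script. The sketch below builds the joint distribution of $(X, C, X')$ for the CNOT controller with a binary symmetric sensor and evaluates $\Delta H_{closed}$, $I(X;C)$ and $I(X';C)$ directly; the parameter `p` for $\Pr(X=1)$ is a name introduced here for illustration.

```python
import math
from collections import defaultdict
from itertools import product

def H(joint, idx):
    """Entropy (bits) of the marginal of `joint` over the coordinates in idx."""
    marg = defaultdict(float)
    for outcome, prob in joint.items():
        marg[tuple(outcome[i] for i in idx)] += prob
    return -sum(q * math.log2(q) for q in marg.values() if q > 0)

def cnot_with_noisy_sensor(p, e):
    """Return (Delta H_closed, I(X;C), I(X';C)) for X' = X XOR C.

    p = Pr(X = 1); the sensor is a binary symmetric channel with error rate e.
    """
    joint = defaultdict(float)                 # outcomes are (x, c, x')
    for x, flip in product((0, 1), repeat=2):
        c = x ^ flip
        prob = (p if x else 1 - p) * (e if flip else 1 - e)
        joint[(x, c, x ^ c)] += prob
    dh_closed = H(joint, (0,)) - H(joint, (2,))
    info = H(joint, (0,)) + H(joint, (1,)) - H(joint, (0, 1))
    leak = H(joint, (2,)) + H(joint, (1,)) - H(joint, (1, 2))
    return dh_closed, info, leak

optimal = cnot_with_noisy_sensor(0.5, 0.1)      # H(X) = 1 bit
suboptimal = cnot_with_noisy_sensor(0.2, 0.1)   # H(X) < 1 bit
```

For this particular controller the gap $I(X;C) - \Delta H_{closed}$ coincides with $I(X';C)$, so the controller is optimal exactly when the final state carries no residual correlation with the control variable.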



D. Continuous-time limit 

To derive a differential analog of the closed-loop optimality theorem
for systems evolving continuously in time, one could try to proceed as
follows: sample the state, say $X(t)$, of a controlled system at two
time instants separated by some (infinitesimal) interval $\Delta t$,
and from there directly apply inequality (47) to the open- and
closed-loop entropy reductions associated with the two end-points
$X(t)$ and $X(t + \Delta t)$ using $I(X(t);C(t))$ as the information
gathered at time $t$. However sound this approach might appear, it
unfortunately proves to be inconsistent for many reasons. First,
although one may obtain well-defined rates for $H(X(t))$ in the open-
or closed-loop regime, the quantity

$$\lim_{\Delta t \to 0} \frac{I(X(t);C(t))}{\Delta t} \tag{63}$$

does not constitute a rate, for $I(X(t);C(t))$ is not a differential
element which vanishes as $\Delta t$ approaches 0. Second, our very
definition of open-loop control, namely the requirement that $I(X;C)$
be equal to 0 prior to actuation, fails to apply for continuous-time
dynamics. Indeed, open-loop controllers operating continuously in time
must always be such that $I(X(t);C(t)) \neq 0$ if purposeful control
is to take place. Finally, are we allowed to extend a result derived
in the context of a Markovian or memoryless model of controllers to
sampled continuous-time processes, even if the sampled version of such
processes has a memoryless structure? Surely, the answer is no.

To overcome these problems, we suggest the following conditional
version of the optimality theorem. Let $X(t - \Delta t)$, $X(t)$ and
$X(t + \Delta t)$ be three consecutive sampled points of a controlled
trajectory $X(t)$. Also, let $C(t - \Delta t)$ and $C(t)$ be the
states of the controller during the time interval in which the state
of the controlled system is estimated. (The actuation step is assumed
to take place between the time instants $t$ and $t + \Delta t$.) Then,
by redefining the entropy reductions as conditional entropy reductions
following

$$\Delta H^t = H(X(t)|C^{t-\Delta t}) - H(X(t+\Delta t)|C^{t-\Delta t}), \tag{64}$$

where $C^t$ represents the control history up to time $t$, we must
have

$$\Delta H^t_{closed} \le \Delta H^t_{open} + I(X(t);C(t)|C^{t-\Delta t}). \tag{65}$$

Note that by thus conditioning all quantities with $C^{t-\Delta t}$,
we extend the applicability of the closed-loop optimality theorem to
any class of control processes, memoryless or not. Now, since

$$I(X(t-\Delta t);C(t-\Delta t)|C^{t-\Delta t}) = 0 \tag{66}$$

by the definition of the mutual information, we also have

$$\Delta H^t_{closed} \le \Delta H^t_{open} + I(X(t);C(t)|C^{t-\Delta t}) - I(X(t-\Delta t);C(t-\Delta t)|C^{t-\Delta t}). \tag{67}$$

As a result, by dividing both sides of the inequality by $\Delta t$,
and by taking the limit $\Delta t \to 0$, we obtain the rate equation

$$\dot{H}_{closed} \le \dot{H}_{open} + \dot{I} \tag{68}$$

if, indeed, the limit exists. This equation relates the rate at which
the conditional entropy $H(X(t)|C^{t-\Delta t})$ is dissipated in time
with the rate at which the conditional mutual information
$I(X(t);C(t)|C^{t-\Delta t})$ is gathered upon estimation. The
difference between the above information rate and the previous
pseudo-rate reported in Eq.(63) lies in the fact that
$I(X(t);C(t)|C^{t-\Delta t})$ represents the differential information
gathered during the latest estimation stage of the control process. It
does not include past correlations induced by the control history
$C^{t-\Delta t}$. This sort of conditioning allows, in passing, a
perfectly meaningful re-definition of open-loop control in
continuous-time, namely $\dot{I} = 0$, since the only correlations
between $X(t)$ and $C(t)$ which can be accounted for in the absence of
direct estimation are those due to the past control history.



VI. APPLICATIONS 

A. Proportional controllers 

There are several controllers in the real world which have the
character of applying a control signal with amplitude proportional to
the distance or error between some estimate $\hat{X}$ of the state
$X$, and a desired target point $x^*$. In the control engineering
literature, such controllers are designated simply by the term
proportional controllers [38]. As a simple version of a controller of
this type, we study in this section the following system:

$$X' = X - C, \qquad C = \hat{X} - x^*, \tag{69}$$

with all random variables assuming values on the real line. For
simplicity, we set $x^* = 0$ and consider two different estimation or
sensor channels defined mathematically by

$$C_\Delta = X^\Delta \tag{70}$$

and

$$C_Z = \hat{X} = X + Z, \tag{71}$$

where $Z \sim \mathcal{N}(0,N)$ (Gaussian distribution with zero mean
and variance $N$). The first kind of estimation, Eq.(70), is a
coarse-grained measurement of $X$ with a grid of size $\Delta$; it
basically allows the controller to 'see' $X$ within a precision
$\Delta$, and selects the middle coordinate of each cell of the grid
as the control value for $C_\Delta$. The other sensor channel
represented by the control state $C_Z$ is simply the Gaussian channel
with noise variance $N$.

Let us start our study of the proportional controller by considering the coarse-grained sensor channel first. If we assume that $X \sim U(0, \varepsilon)$ (uniform distribution over an interval of width $\varepsilon$ centered around 0), and pose that $\varepsilon/\Delta$ is an integer, then we must have

$$ I(X; C_\Delta) = \log(\varepsilon/\Delta). \qquad (72) $$

Now, to obtain $p_{X'}(x')_{\rm closed}$, note that the conditional random variables $X|c$ defined by conditional analysis are all uniformly distributed over non-overlapping intervals of width $\Delta$, and that, moreover, all of these intervals must be moved under the control law around $x' = 0$ without deformation. Hence, $X' \sim U(0, \Delta)$, and

$$ \Delta H_{\rm closed} = \log \varepsilon - \log \Delta = \log(\varepsilon/\Delta). \qquad (73) $$

These results, combined with the fact that $\Delta H^{\max}_{\rm open} = 0$, prove that the coarse-grained controller is always optimal, at least provided again that $\varepsilon$ is a multiple of $\Delta$.

In the case of the Gaussian channel, the situation for optimality is different. Under the application of the estimation law (71), the final state of the controlled system is

$$ X' = X - C = X - (X + Z) = -Z, \qquad (74) $$

so that $X' \sim Z$. This means that if we start with $X \sim \mathcal{N}(0, P)$, then

$$ \Delta H_{\rm closed} = \tfrac{1}{2} \log(2\pi e P) - \tfrac{1}{2} \log(2\pi e N) \qquad (75) $$

and

$$ I(X; C_Z) = \tfrac{1}{2} \log\left(1 + \frac{P}{N}\right). \qquad (76) $$

Again, $\Delta H^{\max}_{\rm open} = 0$ (recall that $\Delta H^{\max}_{\rm open}$ does not depend on the choice of the sensor channel), and so we conclude that optimality is achieved only in the limit where the signal-to-noise ratio goes to infinity. Non-optimality, for this control setup, can be traced back to the presence of some overlap between the different conditional distributions $p(x|c)$, which is responsible for the mixing upon application of the control. As $P/N \to \infty$, the 'area' covered by the overlapping regions decreases, and so does $I(X'; C)$. Based on this observation, we have attempted to change the control law slightly so as to minimize the mixing in the control while keeping the overlap constant and found that complete optimality for the Gaussian channel controller can be achieved if the control law is modified to

$$ X' = X - \gamma C, \qquad (77) $$

with a gain parameter $\gamma$ set to

$$ \gamma = \frac{P}{P + N}. \qquad (78) $$

This controller can readily be verified to be optimal.
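As a quick sanity check of this claim, the following minimal sketch compares the entropy reduction of the Gaussian-channel controller with the mutual information of the sensor channel. It assumes independent $X \sim \mathcal{N}(0, P)$ and $Z \sim \mathcal{N}(0, N)$, the control law $X' = X - \gamma (X + Z)$, and the gain $\gamma = P/(P+N)$, which is the value that minimizes the variance of $X'$. The function names and the numerical values of $P$ and $N$ are illustrative choices, not taken from the text.

```python
import math


def entropy_reduction(P, N, gamma):
    """Delta H_closed (in nats) for X' = X - gamma * (X + Z).

    X ~ N(0, P) and Z ~ N(0, N) are assumed independent, so X' is Gaussian
    with variance (1 - gamma)^2 * P + gamma^2 * N.
    """
    var_final = (1.0 - gamma) ** 2 * P + gamma ** 2 * N
    return 0.5 * math.log(P / var_final)


def mutual_information(P, N):
    """I(X; C_Z) (in nats) for the Gaussian channel C_Z = X + Z."""
    return 0.5 * math.log(1.0 + P / N)


# Illustrative values (assumed, not from the paper).
P, N = 4.0, 1.0
unit_gain = entropy_reduction(P, N, 1.0)            # control law X' = X - C
tuned_gain = entropy_reduction(P, N, P / (P + N))   # control law X' = X - gamma * C
bound = mutual_information(P, N)                    # Delta H_open^max = 0 here

print(f"unit gain : {unit_gain:.4f} nats")
print(f"tuned gain: {tuned_gain:.4f} nats")
print(f"I(X; C_Z) : {bound:.4f} nats")
```

With unit gain the reduction is $\tfrac{1}{2}\ln(P/N)$, strictly below the bound, whereas the tuned gain saturates it for any finite signal-to-noise ratio.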

B. Noisy control of chaotic maps 

FIG. 4: (a) Typical uncontrolled trajectory of the logistic map with $r = 3.7825$. (b) Controlled trajectory which results from applying the OGY feedback control at time $n = 50$. Note the instant resurgence of instability as the control is switched off at $n = 150$. The gain for this simulation was set to $\gamma = -7.0$, and $D = [0.725, 0.745]$. (c) Entropy (in arbitrary units) associated with the position of the controlled system versus time (see text).

The second application is aimed at illustrating the closed-loop optimality theorem in the context of a controller restricted to use entropy-increasing actuation dynamics, as is often the case in the control of chaotic systems. To this end, we consider the feedback control scheme proposed by Ott, Grebogi and Yorke (OGY) [45] as applied to the logistic map

$$ x_{n+1} = r_n x_n (1 - x_n), \qquad (79) $$

where $x_n \in [0, 1]$, and $r_n \in [0, 4]$, $n = 0, 1, 2, \ldots$. In a nutshell, the OGY control method consists in setting the control parameter $r_n = r + \delta r_n$ at each time step $n$ according to

$$ \delta r_n = -\gamma (c_n - x^*) \qquad (80) $$

whenever the estimated state $c_n = \hat{x}_n$ falls into a small control region $D$ in the vicinity of a target point $x^*$. This target state is usually taken to be an unstable fixed point satisfying the equation $f(r, x^*) = x^*$, where $f(r, x)$ is the unperturbed map having $r_n = r$ as a constant control parameter. Moreover, the gain $\gamma$ is fixed so as to ensure that the trajectory $\{x_n\}_{n=0}^{\infty}$ is stable under the control action. (See [46, 47] for a derivation of the stability conditions for $\gamma$ based on linear analysis, and [48, 49] for a review of the field of chaotic control.)
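The control rule just described can be sketched in a few lines. The block below is only an illustration under stated assumptions: the perturbation is taken to be added to the nominal parameter ($r_n = r + \delta r_n$), the sensor is noiseless ($c_n = x_n$), and the values $r = 3.7825$, $\gamma = -7.0$ and $D = [0.725, 0.745]$ are those quoted in the caption of Fig. 4. The function names `logistic`, `ogy_step` and `run` are hypothetical.

```python
def logistic(r, x):
    """One iteration of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def ogy_step(x, r=3.7825, gamma=-7.0, region=(0.725, 0.745), control_on=True):
    """Advance the state by one step, applying the feedback inside the region D.

    Assumptions: noiseless sensor (c = x) and r_n = r + delta_r.
    """
    x_star = (r - 1.0) / r            # unstable fixed point of the unperturbed map
    c = x                             # estimated state (noiseless sensor assumed)
    delta_r = 0.0
    if control_on and region[0] <= c <= region[1]:
        delta_r = -gamma * (c - x_star)
    return logistic(r + delta_r, x)


def run(x0, n_steps, t_on=50, t_off=150):
    """Trajectory with the controller active for t_on <= n < t_off."""
    xs = [x0]
    for n in range(n_steps):
        xs.append(ogy_step(xs[-1], control_on=(t_on <= n < t_off)))
    return xs


if __name__ == "__main__":
    trajectory = run(0.3, 200)
    print(trajectory[45:55])
```

For these parameter values the linearized controlled map at $x^*$ has slope $(2 - r) - \gamma (r - 1)/r^2 \approx -0.42$, so a state that enters $D$ is contracted toward $x^*$ as long as the control stays on.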

Figure 4 illustrates the effect of the OGY controller when applied to the logistic map. The plot of Figure 4a shows a typical chaotic trajectory obtained by iterating the dynamical equation (79) with $r_n = r = 3.7825$. Note on this plot the presence of non-recurring oscillations around the unstable fixed point $x^*(r) = (r - 1)/r \approx 0.7356$. Figure 4b shows the orbit of the same initial point $x_0$ now stabilized by the OGY controller around $x^*$ for $n \in [50, 150]$. For this latter simulation, and more generally for any initial points in the unit interval, the controller is able to stabilize the state of the logistic map in some region surrounding $x^*$, provided that $\gamma$ is a stable gain, and that the sensor channel is not too noisy. To demonstrate the stability properties of the controller, we have calculated the entropy $H(X_n)$ by constructing a normalized histogram $p_{X_n}(x_n)$ of the positions of a large ensemble of trajectories ($\sim 10^4$) starting at different initial points. The result of this numerical computation is shown in Figure 4c. On this graph, one can clearly distinguish four different regimes in the evolution of $H(X_n)$, numbered from (i) to (iv), which mark four different regimes of dynamics:

(i) Chaotic motion with constant $r$: Exponential divergence of nearby trajectories initially located in a very small region of the state space. The slope of the linear growth of entropy, the signature of chaos [50, 51], is probed by the value of the Lyapunov exponent

$$ \lambda(r) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \ln \left| \frac{\partial f(r, x)}{\partial x} \right|_{x = x_n}. \qquad (81) $$

(ii) Saturation: At this point, the distribution of positions $p_{X_n}(x_n)$ for the chaotic system has reached a limiting or equilibrium distribution which nearly fills all the unit interval.

(iii) Transient stabilization: When the controller is activated, the set of trajectories used in the calculation of $H(X_n)$ is compressed around $x^*$ exponentially rapidly in time.

(iv) Controlled regime: An equilibrium situation is reached whereby $H(X_n)$ stays nearly constant. In this regime, the system has been controlled down to a given residual entropy which specifies the size of the basin of control, i.e., the average distance from $x^*$ to which $X_n$ has been controlled.
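The kind of ensemble calculation behind these regimes can be sketched as follows. This is a minimal illustration, not the simulation used for Fig. 4c: it assumes a noiseless sensor, the rule $r_n = r + \delta r_n$, and the parameters of the Fig. 4 caption, while the ensemble size (2000), the number of histogram bins (200), the initial interval $[0.3, 0.3001]$ and the random seed are arbitrary choices made here. The entropy is the Shannon entropy (in nats) of the normalized histogram of positions.

```python
import math
import random

R, GAMMA, REGION = 3.7825, -7.0, (0.725, 0.745)   # values quoted for Fig. 4
X_STAR = (R - 1.0) / R


def step(x, control_on):
    """One step of the logistic map with optional OGY feedback (noiseless sensor)."""
    delta_r = 0.0
    if control_on and REGION[0] <= x <= REGION[1]:
        delta_r = -GAMMA * (x - X_STAR)
    return (R + delta_r) * x * (1.0 - x)


def histogram_entropy(xs, n_bins=200):
    """Shannon entropy (nats) of the normalized histogram of xs on [0, 1]."""
    counts = [0] * n_bins
    for x in xs:
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    total = len(xs)
    return -sum((k / total) * math.log(k / total) for k in counts if k)


def entropy_trace(n_traj=2000, n_steps=150, t_on=50, seed=0):
    """H(X_n) for n = 0..n_steps-1, with the controller switched on at n = t_on."""
    rng = random.Random(seed)
    xs = [0.3 + 1e-4 * rng.random() for _ in range(n_traj)]  # tiny initial cell
    trace = []
    for n in range(n_steps):
        trace.append(histogram_entropy(xs))
        xs = [step(x, n >= t_on) for x in xs]
    return trace


if __name__ == "__main__":
    for n, h in enumerate(entropy_trace()):
        if n % 10 == 0:
            print(n, round(h, 3))
```

Because the histogram has a finite bin width, the computed entropy differs from the differential entropy by an additive constant, which is consistent with the "arbitrary units" of Fig. 4c.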

It is the size of the basin of control, and, more precisely, its dependence on the amount of information provided by the sensor channel which is of interest to us here. In order to study this dependence, we have simulated the OGY controller, and have compared the value of the residual entropy $H(X_n)$ for two types of sensor channel: the coarse-grained channel $C_n = C_\Delta(X_n)$, and the Gaussian channel $C_n = C_Z(X_n)$.

In the case of the coarse-grained channel, we have found that the distribution of $X_n$ in the controlled regime was well approximated by a uniform distribution of width $\varepsilon$ centered around the target point $x^*$. Thus, the indicator value for the size of the basin of control is taken to correspond to

$$ \varepsilon = e^{H(X_n)}, \qquad (82) $$

which, according to the closed-loop optimality theorem, must be such that

$$ \varepsilon \ge e^{\lambda^*} \varepsilon_m, \qquad (83) $$

where $\lambda^*$ is the Lyapunov exponent associated with the $r$ value of the unperturbed logistic map, and where $\varepsilon_m$ is the coarse-grained measurement interval or precision of the sensor channel. (All logarithms are in natural base in this section.) To understand the above inequality, note that a uniform distribution for $X_n$ covering an interval of size $\delta$ must stretch by a factor $e^{\lambda(r)}$ after one iteration of the map with parameter $r$. This follows from the fact that $\lambda(r)$ corresponds to an entropy rate of the dynamical system [50, 51] (see also [40, 41]), and holds in an average sense inasmuch as the support of $X_n$ is not too small or does not cover the entire unit interval. Now, for open-loop control, it can be seen that if $\lambda(r) > 0$ for all admissible control values $r$, then no control of the state $X_n$ is possible, and the optimal control strategy must consist in using the smallest Lyapunov exponent $\lambda_{\min}$ available in order to achieve

$$ \Delta H^{\max}_{\rm open} = H(X_n) - H(X_{n+1})_{\rm open} = \ln \delta - \ln e^{\lambda_{\min}} \delta = -\lambda_{\min} < 0. \qquad (84) $$

FIG. 5: (Data points) Control interval $\varepsilon$ as a function of the effective coarse-grained interval of measurement $\varepsilon_m$ for four different target points. (Solid line) Optimal linear relationship predicted by the closed-loop optimality theorem. The values of $r$ and the Lyapunov exponents $\lambda^*$ associated with the target points are listed in Table I and displayed in Figure 6.


In the course of the simulations, we noticed that only a very narrow range of $r$ values was actually used in the controlled regime, which means that $\Delta H^{\max}_{\rm open}$ can be taken for all purposes to be equal to $-\lambda^*$. At this point, then, we need only to use expression (72) for the mutual information of the coarse-grained channel, substituting $\Delta$ with $\varepsilon_m$, to obtain

$$ \Delta H_{\rm closed} \le -\lambda^* + \ln(\varepsilon/\varepsilon_m). \qquad (85) $$

This expression yields the aforementioned inequality by posing $\Delta H_{\rm closed} = 0$ (controlled regime).
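To make the bound concrete, the sketch below estimates the Lyapunov exponent of the logistic map by the time average of Eq. (81), using $\partial f/\partial x = r(1 - 2x)$, and then evaluates the smallest control interval $e^{\lambda^*} \varepsilon_m$ allowed by the inequality. It assumes a constant parameter $r$ and a noiseless orbit; the function names, the burn-in length, the number of iterations and the initial point are illustrative choices rather than the settings used for the figures.

```python
import math


def lyapunov_logistic(r, n_iter=100_000, burn_in=1_000, x0=0.3):
    """Finite-time estimate of the Lyapunov exponent of x -> r * x * (1 - x).

    Averages ln|r * (1 - 2 * x_n)| along one orbit after discarding a transient.
    """
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        total += math.log(abs(r * (1.0 - 2.0 * x)))
        x = r * x * (1.0 - x)
    return total / n_iter


def min_control_interval(lyap, eps_m):
    """Smallest interval eps compatible with eps >= exp(lyap) * eps_m."""
    return math.exp(lyap) * eps_m


if __name__ == "__main__":
    # r = 2.5 has a stable fixed point at x = 0.6, where the slope is -0.5.
    print(lyapunov_logistic(2.5), math.log(0.5))
    # Bound for a measurement precision of 0.001 and the exponent 0.4088.
    print(min_control_interval(0.4088, 1e-3))
```

For instance, with $\lambda^* = 0.4088$ the factor $e^{\lambda^*}$ is about 1.5, so the controlled interval cannot be narrower than roughly one and a half times the measurement precision.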

The plots of Figure 5 present our numerical calculations of $\varepsilon$ as a function of $\varepsilon_m$. Each of these plots has been obtained by calculating Eq. (82) using the entropy of the normalized histogram of the positions of about $10^4$ different controlled trajectories. Other details about the simulations may be found in the caption. What differentiates the four plots is the fixed point to which the ensemble of trajectories has been stabilized, and, accordingly, the value of the Lyapunov exponent $\lambda^*$ associated with $x^*(r)$. These are listed in Table I and illustrated in Figure 6. One can verify on the plots of Figure 5 that the points of $\varepsilon$ versus $\varepsilon_m$ all lie above the critical line (solid line in the graphs) which corresponds to the optimality prediction of inequality (83). Also, the relatively small departure of the numerical data from the optimal prediction shows that the OGY controller with the coarse-grained channel is nearly optimal with respect to the entropy criterion. This may be explained by noticing that this sort of controller complies with all the requirements of the first class of linear proportional controllers studied previously. Hence, we expect it to be optimal for all precision $\varepsilon_m$, although the fact must be considered that $\Delta H^{\max}_{\rm open} = -\lambda^*$ is only an approximation. In reality, not all points are controlled with the same parameter $r$ for a given value of $\varepsilon_m$, as shown in Figure 6. Moreover, how $\varepsilon$ is calculated explicitly relies on the assumption that the distribution for $X_n$ is uniform. This assumption has been verified numerically; yet, it must also be regarded as an approximation. Taken together, these two approximations may explain the observed deviations of $\varepsilon$ from its optimal value.

FIG. 6: (a) Lyapunov spectrum $(r, \lambda(r))$ of the logistic map. The positive Lyapunov exponents associated with the four target points listed in Table I are located by the circles. The set of $r$ values used during the control spans approximately the diameter of the circles. Note that the few negative values of $\lambda(r)$ close to the $\lambda^*$'s are effectively suppressed by the noise in the sensor channel. This is evidenced by the graph of (b) which was obtained by computing the sum (81) up to N = 2 x 10'' with an additive noise component of very small amplitude. See [52, 53] for more details on this point.

TABLE I: Characteristics of the four target points.

Target point    x*        r         λ* (base e)
1               0.7218    3.5950    0.1745
2               0.7284    3.6825    0.3461
3               0.7356    3.7825    0.4088
4               0.7455    3.9290    0.5488

FIG. 7: (Data points) Dispersion $P$ characterizing the basin of attraction of the controlled system as a function of the noise power $N$ introduced in the Gaussian sensor channel. The horizontal and vertical axes are to be rescaled by a factor 10~®. (Solid line) Optimal lower bound.

For the Gaussian channel, optimality is also closely related to our results about proportional controllers. The results of our simulations, for this type of channel, indicated that the normalized histogram of the controlled positions for $X_n$ is very close to a normal distribution with mean $x^*$ and variance $P$. As a consequence, we now consider the variance $P$, which for Gaussian random variables is given by

$$ P = \frac{e^{2H(X_n)}}{2\pi e}, \qquad (86) $$

as the correlate of the size of the basin of control. For this quantity, the closed-loop optimality theorem with $\Delta H_{\rm closed} = 0$ yields

$$ P \ge (e^{2\lambda^*} - 1) N, \qquad (87) $$

where $N$ is the variance of the zero-mean Gaussian noise perturbing the sensor channel.
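For reference, the steps leading from the closed-loop optimality theorem to this bound can be written out explicitly. The derivation below assumes, as in the coarse-grained case, that $\Delta H^{\max}_{\rm open} \approx -\lambda^*$, and that the controlled state is Gaussian with variance $P$, so that the mutual information of the Gaussian channel, Eq. (76), applies with natural logarithms.

```latex
\begin{align*}
0 = \Delta H_{\rm closed}
  &\le \Delta H^{\max}_{\rm open} + I(X_n; C_n) \\
  &\approx -\lambda^* + \tfrac{1}{2} \ln\!\left(1 + \frac{P}{N}\right), \\
\text{so that}\quad
  \ln\!\left(1 + \frac{P}{N}\right) &\ge 2\lambda^*
  \quad\Longrightarrow\quad
  P \ge \left(e^{2\lambda^*} - 1\right) N .
\end{align*}
```

The factor 2 in the exponent comes from the factor $\tfrac{1}{2}$ in the Gaussian mutual information; the analogous step for the coarse-grained channel gives a single factor $e^{\lambda^*}$.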

In Figure 7, we have displayed our numerical data for $P$ as a function of the noise power $N$. The solid line gives the optimal relationship which results from taking equality in the above expression, and from substituting the Lyapunov exponent associated with one of the four stabilized points listed in Table I. From the plots of this figure, we verify again that $P$ is lower bounded by the optimal value predicted analytically. However, now it can be seen that $P$ deviates significantly from its optimal value, making clear that the OGY controller driven by the Gaussian noisy sensor channel is not optimal (except in the trivial limit where $N \to 0$). This is in agreement with our proof that linear proportional controllers with Gaussian sensor channel are not optimal in general. On the plots of Fig. 7, it is quite remarkable to see that the data points all converge to straight lines. This suggests that the mixing induced by the controller, the source of non-optimality, can be accounted for simply by modifying our inequality for $P$ so as to obtain

$$ P = (e^{2\lambda'} - 1) N. \qquad (88) $$

The new exponent $\lambda'$ can be interpreted as an effective Lyapunov exponent; its value is necessarily greater than $\lambda^*$, since the chaoticity properties of the controlled system are enhanced by the mixing effect of the controller.
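If the data follow a straight line $P = sN$, the effective exponent can be read off from the fitted slope $s$ by inverting the linear relation between $P$ and $N$, which gives $\lambda' = \tfrac{1}{2} \ln(1 + s)$. The short sketch below illustrates this inversion; the function names are hypothetical and the slope used in the example is a made-up number, not a value measured from Fig. 7.

```python
import math


def optimal_slope(lam_star):
    """Slope P/N of the optimal line, exp(2 * lam_star) - 1."""
    return math.expm1(2.0 * lam_star)


def effective_exponent(slope):
    """Invert P = (exp(2 * lam) - 1) * N for lam, given slope = P / N."""
    if slope <= -1.0:
        raise ValueError("slope must be greater than -1")
    return 0.5 * math.log1p(slope)


if __name__ == "__main__":
    lam_star = 0.4088                      # exponent of target point 3, Table I
    s_opt = optimal_slope(lam_star)
    s_fit = 2.0 * s_opt                    # hypothetical fitted slope, for illustration
    print(s_opt, effective_exponent(s_fit))
```

Any fitted slope larger than the optimal one yields $\lambda' > \lambda^*$, in line with the interpretation of $\lambda'$ as an effective exponent enhanced by mixing.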

VII. CONCLUDING REMARKS 
A. Control and thermodynamics 

The reader familiar with thermodynamics may have noted a strong similarity between the functioning of a controller, when viewed as a device aimed at reducing the entropy of a system, and the thought experiment of Maxwell known as Maxwell's demon paradox [27]. Such a similarity was already noted in the Introduction section of this work. In the case of Maxwell's demon, the system to be controlled or 'cooled' is a volume of gas; the entropy to be reduced is the equilibrium thermodynamic entropy of the gas; and the 'pieces' of information gathered by the controller (the demon) are the velocities of the atoms or molecules constituting the gas. When applied to this scheme, our result on closed-loop optimality can be translated into an absolute limit to the ability of the demon, or any control device, to convert heat to work. Indeed, consider a feedback controller operating in a cyclic fashion on a system in contact with a heat reservoir at temperature $T$. According to Clausius law of thermodynamics [54], the amount of heat $\Delta Q_{\rm closed}$ extracted by the controller upon reducing the entropy of the controlled system by a concomitant amount $\Delta H_{\rm closed}$ must be such that

$$ \Delta Q_{\rm closed} = (k_B T \ln 2)\, \Delta H_{\rm closed}. \qquad (89) $$

In the above equation, $k_B$ is the Boltzmann constant which provides the necessary conversion between units of energy (Joule) and units of temperature (Kelvin); the constant $\ln 2$ arises because physicists usually prefer to express logarithms in base $e$. From the closed-loop optimality theorem, we then write

$$ \Delta Q_{\rm closed} \le (k_B T \ln 2) [\Delta H^{\max}_{\rm open} + I(X; C)] = \Delta Q^{\max}_{\rm open} + (k_B T \ln 2)\, I(X; C), \qquad (90) $$

where $\Delta Q^{\max}_{\rm open} = (k_B T \ln 2)\, \Delta H^{\max}_{\rm open}$. This limit should be compared with analogous results found by other authors on the subject of thermodynamic demons (see, e.g., the articles reprinted in [27], and especially Szilard's analysis of Maxwell's demon [55], which contains many premonitory insights about the use of information in control).
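To give a sense of scale, the bound can be evaluated numerically. The sketch below assumes that entropies and mutual information are expressed in bits, as implied by the factor $\ln 2$, and uses the exact SI value of the Boltzmann constant; the function name and the example temperature of 300 K are illustrative choices.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact value in the SI)


def max_heat_extracted(temperature, dh_open_max_bits, info_bits):
    """Upper bound on Delta Q_closed in joules.

    Evaluates (k_B * T * ln 2) * (Delta H_open^max + I(X; C)), with both
    entropy terms given in bits.
    """
    return K_B * temperature * math.log(2) * (dh_open_max_bits + info_bits)


if __name__ == "__main__":
    # One bit of sensor information at T = 300 K, with Delta H_open^max = 0.
    print(max_heat_extracted(300.0, 0.0, 1.0))
```

At 300 K, each bit gathered by the sensor allows at most about $2.9 \times 10^{-21}$ J of additional heat to be extracted per cycle, which is the same energy scale as in Szilard's analysis.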

It should be remarked that the connection between the 
problem of Maxwell's demon, thermodynamics, and con- 
trol is effective only to the extent that Clausius law pro- 
vides a link between entropy and the physically measur- 
able quantity that is energy. But, of course, the notion of 
entropy is a more general notion than what is implied by 
Clausius law; it can be defined in relation to several situ- 
ations which have no direct relationship whatsoever with 
physics (e.g., coding theory, rate distortion theory, deci- 
sion theory). This versatility of entropy is implicit here. 
Our results do not rely on thermodynamic principles, or 
even physical principles for that matter, to be true. They 
constitute valid results derived in the context of a general 
model of control processes whose precise nature is yet to 
be specified. 

B. Entropy and optimal control theory 

Consideration of entropy as a measure of dispersion and uncertainty led us to choose this quantity as a control function of interest, but other information-theoretic quantities may well have been chosen instead if different control applications so require. From the point of view of optimal control theory, all that is required is to minimize a desired performance criterion (a cost or a Lyapunov function), such as the distance to a target point or the energy consumption, while achieving some desired dynamic performance (stability) using a set of permissible controls [15, 38]. For example, one may be interested in maximizing $\Delta H_{\rm closed}$ instead of minimizing this quantity if destabilization (anti-control) or mixing is an issue [56]. As other examples, let us mention the minimization of the relative entropy distance between the distribution of the state of a controlled system and some target distribution [57], the problem of coding [58], as well as the minimization of rate-like functions in decision or game theory [32, 59-64].



C. Future work 

Many questions pertaining to issues of information and 
control remain at present unanswered. We have consid- 
ered in this paper the first level of investigation of a much 
broader and definitive program of research aimed at pro- 
viding information-theoretic tools for the study of gen- 
eral control systems, such as those involving many inter- 
acting components, as well as controllers exploiting non- 
Markovian features of dynamics (e.g., memory, learning, 
and adaptation). In a sense, what we have studied can 
be compared with the memoryless channel of information 
theory; what is needed in the future is something like a 
control analog of network information theory. Work is 
ongoing along this direction. 
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